Commit Graph

5219 Commits

Author SHA1 Message Date
Alexandre Flament 5d9db6c2f7 [remove] yandex engine 2021-02-11 14:28:06 +01:00
Alexandre Flament 35dd069402 [fix] fix seznam engine
no paging support
2021-02-11 12:53:19 +01:00
Alexandre Flament 7d6e69e2f9 [upd] wikipedia engine: return an empty result on query with illegal characters
on some queries (like an IT error message), wikipedia returns an HTTP error 400.
this commit returns an empty result instead of showing an error to the user.
2021-02-11 12:29:21 +01:00
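
Note: the behaviour described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code; `FakeResponse` and `parse_results` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    """Hypothetical stand-in for an HTTP response object."""
    status_code: int
    text: str = ""


def parse_results(resp: FakeResponse) -> list:
    # HTTP 400 is assumed to mean "the query contained characters the
    # API rejects": report no results instead of an engine error.
    if resp.status_code == 400:
        return []
    # Any other HTTP error is still treated as a failure.
    if resp.status_code >= 400:
        raise RuntimeError(f"HTTP error {resp.status_code}")
    # Placeholder parsing: one result built from the response text.
    return [{"title": resp.text}]
```

With this shape, only the 400 case is silenced; other HTTP errors still surface.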
Alexandre Flament ff84a1af35 [mod] json_engine: add content_html_to_text and title_html_to_text
Some JSON APIs return HTML in either the title or the content.
This commit adds two new parameters to the json_engine:
content_html_to_text and title_html_to_text, False by default.

If True, searx.utils.html_to_text removes the HTML tags.

Update crossref, openairedatasets and openairepublications engines
2021-02-10 16:42:11 +01:00
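
Note: a rough sketch of how two such flags could behave. It does not use searx.utils.html_to_text; `strip_tags` is a simplified stand-in built on the standard library, and `build_result` is a hypothetical helper, so treat this as an illustration under those assumptions.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(value: str) -> str:
    # Simplified stand-in for an HTML-to-text helper.
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()
    return "".join(parser.parts).strip()


def build_result(item, title_html_to_text=False, content_html_to_text=False):
    # Both flags default to False, so existing engines are unaffected.
    title = item["title"]
    content = item["content"]
    if title_html_to_text:
        title = strip_tags(title)
    if content_html_to_text:
        content = strip_tags(content)
    return {"title": title, "content": content}
```

An engine whose API returns markup in only one field would enable only the matching flag.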
Alexandre Flament 436d366448
Merge pull request #2544 from mrwormo/congresslibrary
[Engine] Add Library of Congress engine
2021-02-10 10:13:46 +01:00
Alexandre Flament eafd27f42a
Merge pull request #2556 from dalf/fix-apk-mirror
[fix] fix apk_mirror engine
2021-02-10 10:12:37 +01:00
Alexandre Flament c40316d957
Merge pull request #2558 from dalf/remove-google-play-music
[upd] remove google_play_music engine
2021-02-10 10:12:21 +01:00
Alexandre Flament d2dac11392 [mod] duckduckgo engine: better support of the language preference
After the main request, send a second request to https://duckduckgo.com/t/sl_h

See https://github.com/searx/searx/issues/2259
2021-02-09 14:36:43 +01:00
Alexandre Flament 74d56f6cfb [mod] poolrequests: for one (user request, engine) always use the same HTTPAdapter
The duckduckgo engine requires an additional request after the results have been sent.
This commit makes sure that the second request uses the same HTTPAdapter,
and therefore the same IP address and the same proxy.
2021-02-09 14:33:36 +01:00
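
Note: the idea of pinning one adapter per (user request, engine) pair might look like the sketch below. `AdapterPool` and its methods are hypothetical and the adapters are plain placeholder values; the real poolrequests module is organised differently.

```python
import itertools


class AdapterPool:
    """Hypothetical round-robin pool. Each adapter stands for one
    (source IP address, proxy) combination."""

    def __init__(self, adapters):
        self._cycle = itertools.cycle(adapters)
        self._pinned = {}

    def get(self, request_id, engine_name):
        # The first call for a (user request, engine) pair picks the next
        # adapter; later calls for the same pair reuse it.
        key = (request_id, engine_name)
        if key not in self._pinned:
            self._pinned[key] = next(self._cycle)
        return self._pinned[key]

    def release(self, request_id):
        # Forget all pins once the user request is finished.
        for key in [k for k in self._pinned if k[0] == request_id]:
            del self._pinned[key]
```

A follow-up request issued by the same engine for the same user request then leaves through the same adapter as the main request.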
Markus Heiser bc1be3f0e9 [enh] add engine MediathekViewWeb (API)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-09 13:08:01 +01:00
mrwormo 051da88328 Add Library of Congress engine 2021-02-09 12:45:39 +01:00
Alexandre Flament 9211cdfe9b [upd] remove google_play_music engine
Google Play Music has been replaced by YouTube Music.
2021-02-09 11:38:50 +01:00
Alexandre Flament aedf03c0f7 Fix: activate raise_for_error by default
Fix commit d703119d3a :
Some engines need to parse the HTTP error but
raise_for_error is always set to False in the "request" function.
2021-02-09 11:27:41 +01:00
Alexandre Flament 5e055b069b [fix] fix apk_mirror engine 2021-02-09 11:02:12 +01:00
Alexandre Flament f03ad0a3c0
Merge pull request #2555 from dalf/fix-github-data-update
[fix] fix github action data-update.yml
2021-02-09 10:48:55 +01:00
Alexandre Flament 966a7a1f25 [fix] fix github action data-update.yml 2021-02-09 09:58:59 +01:00
Alexandre Flament e4cc7f13a3
Merge pull request #2542 from kvch/fix-naver-engine
Fix XPATHs in Naver engine
2021-02-09 08:52:38 +01:00
Alexandre Flament bec9e30fe7
Merge pull request #2554 from MarcAbonce/zh-variants-in-wikipedia
Add support for Chinese variants in Wikipedia
2021-02-09 08:49:59 +01:00
Alexandre Flament 6c513095e4
Merge pull request #2553 from danielhones/improve-results-highlighting-updated
Ignore double-quotes when highlighting query parts
2021-02-09 08:39:07 +01:00
Daniel Hones 138f32471c Updated webutils.highlight_content to ignore double-quotes when highlighting query parts 2021-02-08 23:58:54 -05:00
Marc Abonce Seguin 64e81794fe add support for Chinese variants in Wikipedia 2021-02-08 21:56:45 -07:00
Noémi Ványi ac309f5b8d Fix naver engine
Closes #2540
2021-02-07 18:58:13 +01:00
Noémi Ványi ab8739809c
Merge pull request #2538 from return42/drop-metager
[drop] metager - xpath engine won't work anymore
2021-02-07 15:21:40 +01:00
Markus Heiser 41c03cf011 [drop] metager - xpath engine won't work anymore
The new version of MetaGer needs to reload the results (into an iframe) with a
unique tag (see HTML response below).

Implementing a dedicated metager-engine for searx makes no sense to me. The
great days of MetaGer seem to be over.  I remember the good old days when this
project started in the 1990s.  But in the last few years it has become more and
more crap.  As the name suggests, MetaGer was made for Germans in the first
place.  They have added English and Spanish translations, but the i18n is very
poor compared to what searx offers.

It's a pity, let's drop MetaGer.

This is the first response, the id (b82679980656899ba5a17ffd02a56846) is unique
for each query:

    $ curl "https://metager.org/meta/meta.ger3?eingabe=foo&submit-query=&focus=web"
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <link rel="stylesheet" href="/index.css?id=b82679980656899ba5a17ffd02a56846">
        <script src="/index.js?id=b82679980656899ba5a17ffd02a56846"></script>
    <title>foo - MetaGer</title>
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
    </head>
    <body>
        <iframe id="mg-framed" src="https://metager.org/meta/meta.ger3?eingabe=foo&amp;submit-query=&amp;focus=web&amp;mgv=b82679980656899ba5a17ffd02a56846" autofocus="true" onload="this.contentWindow.focus();"></iframe>
     </body>

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-07 14:55:21 +01:00
Noémi Ványi 1f09d7d561
Merge pull request #2539 from OliveiraHermogenes/recoll/paged_json
[feat] recoll: add support for paging
2021-02-07 14:28:57 +01:00
Hermógenes Oliveira 514faa9162 [feat] recoll: paged json support 2021-02-07 10:05:35 -03:00
Alexandre Flament 1e35c3ccce
Merge pull request #2531 from MarcAbonce/fix-browser-locale
[fix] Get correct locale with country from browser
2021-02-05 10:55:37 +01:00
Marc Abonce Seguin c937a9e85f [fix] get correct locale with country from browser
Some of our interface locales include uppercase country codes,
which are separated by `_` instead of the more common `-`.
Also, a browser's `Accept-Language` header could be in lowercase.

This commit attempts to normalize those cases so a browser's
language+country codes can better match with our locales.

This solution assumes that our UI locales have nothing more than
language and optionally country. If we ever add a script specific
locale like `zh-Hant-TW` this would have to change to accommodate
that, but the idea would be pretty much the same as this fix.
2021-02-04 19:53:59 -07:00
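
Note: the normalization described above can be sketched as follows, assuming locales consist of a language and an optional country only. `normalize_locale` and `match_locale` are hypothetical names, and the fallback to the bare language is an assumption of this example, not something the commit message states.

```python
def normalize_locale(tag: str) -> str:
    # Accept both "-" and "_" as separators and any letter case.
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    if len(parts) > 1 and parts[1]:
        # UI locales use "_" and an uppercase country code, e.g. "pt_BR".
        return f"{language}_{parts[1].upper()}"
    return language


def match_locale(accept_tag, ui_locales):
    candidate = normalize_locale(accept_tag)
    if candidate in ui_locales:
        return candidate
    # Assumed fallback: try the language without the country.
    language = candidate.split("_")[0]
    return language if language in ui_locales else None
```

A script-specific tag such as `zh-Hant-TW` is not handled here, which matches the limitation the commit message points out.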
Alexandre Flament 321788f14a
Merge pull request #2528 from dalf/mod-ci-gh-pages
[mod] CI: minor changes
2021-02-04 23:12:27 +01:00
Noémi Ványi ffaf785f82
Merge pull request #2533 from mrwormo/ccengine
[Engine] Add Creative Commons search engine
2021-02-04 22:35:08 +01:00
mrwormo c4c1636b18 Add Creative Commons search engine 2021-02-04 11:31:35 +01:00
Noémi Ványi 006f206dc9
Merge pull request #2530 from return42/fix-user-hb
[fix] make books/user.pdf
2021-02-02 20:50:35 +01:00
Markus Heiser 89554e42a9 [fix] make books/user.pdf
Error:

  Configuration error:
  There is a programmable error in your configuration file:
  ...
  NameError: name 'DOCS_URL' is not defined
  make: *** [utils/makefile.sphinx:156: books/user.latex] Error 2

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-02 20:14:07 +01:00
Alexandre Flament 90b9d0d6a8 [mod] CI: minor changes
* utils/makefile.python: travis-gh-pages renamed ci-gh-pages
2021-02-02 08:53:57 +01:00
Alexandre Flament 34de715e62
Merge pull request #2500 from dalf/github-action-data
[enh] every Sunday, call utils/fetch_*.py scripts and create a PR automatically
2021-02-01 17:16:58 +01:00
Alexandre Flament 1742355eb8
Merge pull request #2499 from dalf/remove-language-support-variable
[mod] dynamically set language_support variable
2021-02-01 17:16:18 +01:00
Alexandre Flament ca93a01844 [mod] dynamically set language_support variable
The language_support variable is set to True by default,
and set to False in only 5 engines.

Apart from the documentation and the /config URL, this variable is not used.

This commit removes the variable definition from the engines and
sets the value according to the length of supported_languages: False when the
length is 0, True otherwise.

Close #2485
2021-02-01 17:10:37 +01:00
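
Note: the rule above (False when supported_languages is empty, True otherwise) amounts to something like this sketch. The `Engine` class is a hypothetical container used only for illustration; searx engines are modules, not instances of such a class.

```python
class Engine:
    """Hypothetical container for engine attributes."""

    def __init__(self, name, supported_languages=()):
        self.name = name
        self.supported_languages = list(supported_languages)

    @property
    def language_support(self) -> bool:
        # Derived from the data instead of being declared per engine.
        return len(self.supported_languages) > 0
```

Because the value is derived, it cannot drift out of sync with the language list.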
Alexandre Flament 99244440e4
Merge pull request #2514 from return42/fix-gh-pages
[fix] Makefile target gh-pages & flatten history of branch gh.pages
2021-02-01 17:07:08 +01:00
Alexandre Flament 0a8799b834
Merge pull request #2517 from dalf/debug-ci
Update pyenv pyenvinstall Make targets
2021-02-01 17:01:34 +01:00
Markus Heiser 8c45f1149d [hardening] github workflows - corrupted cache
aka: ensure that 'make test' works as expected

The cache contains a copy of './local' which is, under some circumstances,
corrupted.  It is not possible to clear the cache [1] (see the top of the page).

Ensure that 'make test' works as expected [2] even if

- the python interpreter is missing
- the virtualenv exists but pyyaml is missing

To harden the workflow against cache failures, this patch adds the new target
'travis.test' to the workflow.  This target tries to import the python module
'yaml'.  If this fails, the virtualenv is rebuilt from scratch.

[1] https://github.com/actions/cache/issues/2#issuecomment-673493515
[2] https://github.com/searx/searx/pull/2517#discussion_r567240235

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01 16:58:04 +01:00
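
Note: the probe described above is implemented as a Makefile target; the Python sketch below only illustrates the decision it makes. `module_available` and `needs_rebuild` are hypothetical helpers, and using `importlib.util.find_spec` instead of an actual import is a choice made for this example.

```python
import importlib.util


def module_available(name: str) -> bool:
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(name) is not None


def needs_rebuild(probe_module: str = "yaml") -> bool:
    # A missing probe module is taken as a sign that the cached
    # environment is incomplete and should be rebuilt.
    return not module_available(probe_module)
```

Probing a single well-known dependency is cheap and catches the "virtualenv exists but pyyaml is missing" case named in the commit message.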
Markus Heiser 38b39ef0ae [fix] re-add 'pip-exe' target - partial revert 9b48ae47
Target pip-exe is a prerequisite of the targets:

  - pyinstall
  - pyuninstall

and was accidentally deleted in commit 9b48ae47.

HINT:
  do not confuse pyinstall with pyenvinstall

pyinstall & pyuninstall
    Installing into user's HOME using pip from OS,
    therefore the message is needed.

pyenvinstall & pyenvuninstall
    Installing into virtualenv (./local) using pip which is provided by
    prerequisite 'pyenv' in the virtualenv.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-01 16:58:04 +01:00
Alexandre Flament d70c5a621a [mod] more robust make pyenv / make pyenvinstall
"make pyenv" ensures that ./local/py3/bin/python is an executable
2021-02-01 16:58:04 +01:00
Alexandre Flament 806af50738
Merge pull request #2494 from return42/rm-fabfile
[fix] remove Fabric file
2021-02-01 15:09:35 +01:00
Markus Heiser 40d2a116e1 [fix] Makefile target gh-pages & flatten history of branch gh.pages
1. This patch fixes error:

    rm -rf gh-pages/
    make V=1 gh-pages
    make[1]: Leaving directory '/800GBPCIex4/share/searx'
    [ -d "gh-pages/.git" ] || git clone  gh-pages
    fatal: repository 'gh-pages' does not exist

2. The gh-pages build has been moved to ./build/gh-pages; this also affects
   'travis-gh-pages'

3. The gh-pages commit messages now include a ref to the repository and commit

4. Since keeping a gh-pages history has no benefit and only makes the
   repository grow quickly, this patch also flattens the history:

    cd build/gh-pages/; git log --oneline
    bash: cd: build/gh-pages/: No such file or directory
    026126be (HEAD -> gh-pages, origin/gh-pages) make gh-pages: from https://github.com/return42/searx.git@71d66979c2935312e0aed7fc7c3cf6199fbe88a2

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-29 11:41:48 +01:00
Alexandre Flament 71d66979c2
Merge pull request #2482 from return42/fix-google-video
[fix] revise of the google-Video engine
2021-01-28 11:11:07 +01:00
Markus Heiser 7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing invalid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
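
Note: one way to avoid raising on results that lack the expected node is to ask for a default and skip the result. The sketch below works on plain lists rather than real XPath matches; `get_index`, `XPathIndexError` and `collect_urls` are hypothetical names, and this is not necessarily how the commit above solves it.

```python
_NOTSET = object()


class XPathIndexError(Exception):
    """Hypothetical stand-in for an 'index not found' exception."""


def get_index(matches, index, default=_NOTSET):
    # Return matches[index] when it exists.
    if -len(matches) <= index < len(matches):
        return matches[index]
    # Without a default, a missing index is an error.
    if default is _NOTSET:
        raise XPathIndexError(f"index {index} not found")
    return default


def collect_urls(results):
    urls = []
    for hrefs in results:
        # Each entry is the list of href matches for one result block.
        url = get_index(hrefs, 0, default=None)
        if url is None:
            continue  # not a valid result, skip it quietly
        urls.append(url)
    return urls
```

Blocks without a link are then ignored instead of producing a traceback for every page.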
Markus Heiser e436287385 [mod] checker: add some additional tests
BTW: fix indentation by 2 spaces

The additional tests have been commented out in the google engines so as not
to trigger any CAPTCHA issues.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Alexandre Flament 0f18e885bf
Merge pull request #2479 from Tobi823/master
Document workaround for using 2 languages simultaneously #1508
2021-01-27 21:29:42 +01:00
Alexandre Flament b661c3f5d4
Merge pull request #2509 from return42/fix-morty-key
[doc] improve admin-docs about result proxy (morty) configuration
2021-01-27 15:31:29 +01:00