Commit Graph

2858 Commits

Author SHA1 Message Date
Adam Tauber
06b754ad67 [mod] increase lobste.rs engine timeout to avoid timeouts most of the time 2021-03-25 01:22:36 +01:00
Adam Tauber
0ba71c3644 [fix] make ina engine compatible with the new response json 2021-03-25 01:20:41 +01:00
Adam Tauber
6255b33c9d [fix] rewrite hoogle to use html/xpath instead of json
the json response has been changed and it contains html chunks which is
not compatible with our json engine, so we have to switch to html/xpath
parsing
2021-03-25 01:13:24 +01:00
Adam Tauber
45f0e1a859 [fix] update geektimes.ru url - it redirects to habr.com 2021-03-25 01:02:19 +01:00
Adam Tauber
50ba2b9e87 [fix] update google play movies xpath 2021-03-25 00:55:53 +01:00
Adam Tauber
88657fe9c2 [fix] update google play apps xpath 2021-03-25 00:55:43 +01:00
Adam Tauber
5f450fda74 [enh] add year filter to duckduckgo 2021-03-25 00:25:36 +01:00
Adam Tauber
fd737dc9d8 [fix] remove debug code 2021-03-24 23:54:39 +01:00
Alexandre Flament
d648001688 [mod] preferences: a tooltip is shown when the mouse is over the engine names 2021-03-22 08:22:59 +01:00
Alexandre Flament
6bd01bf81f [mod] oscar: fix the sourcemap URL in *.min.css
Close https://github.com/searx/searx/issues/2670

Note: clean-css contains a bug:
* a multiline comment or URL adds "$stdin" to the sourcemap (see src/less/logicodev/search.less)
* in this case when the user opens the devtools, the browser fails to load this "https://.../$stdin" URL
* it is not a problem and the error appears only when the user actively tries to debug the CSS.
* seems related to https://github.com/jakubpawlowicz/clean-css/issues/593
2021-03-21 18:03:40 +01:00
Alexandre Flament
a48ec0b4bd
Merge pull request #2671 from searx/update-soundcloud
[mod] soundcloud: faster initialization
2021-03-21 15:10:39 +01:00
Alexandre Flament
30c950a2c7
Merge pull request #2660 from dalf/upd-translations
[mod] replace /translations.js with an embedded JSON
2021-03-21 12:39:26 +01:00
Alexandre Flament
38c210d746
[mod] soundcloud: faster initialization
The get_cliend_id() function:
* fetches https://soundcloud.com
* then fetches each referenced javascript URL to get the client id.

This commit fetches the javascript URLs in the reverse order: the client id is in the last javascript URL.
2021-03-21 09:29:53 +01:00
James Higginbotham
ce6eb81a71
Update settings.yml to enable HTTP for yacy
Added a line to the yacy entry to enable HTTP if the local yacy instance isn't using HTTPS. Otherwise, an error will be thrown in the logs: "No connection adapters were found for 'http://localhost:8090/yacysearch.json...'". This is likely related to ticket #2641 that forces HTTPS by default.
2021-03-19 15:06:25 -06:00
Alexandre Flament
2b0dd96bd3 [mod] oscar: remove space
* reduce by 15% the uncompressed output (on average)
* dos2unix searx/templates/oscar/result_templates/files.html
2021-03-17 09:22:05 +01:00
Dr. Rolf Jansen
7a9dc63d74
Merge branch 'master' into conditional-sigusr1 2021-03-16 08:45:57 -03:00
Alexandre Flament
6553c79029 [mod] replace /translations.js by embedded JSON
In webapp.py, there is a new function "get_translations" lists available translations

Close #2064
2021-03-16 11:22:21 +01:00
Alexandre Flament
32cd0d31b3 [mod] upgrade pygments
add searx_extra/update/update_pygments.py to update the css style of the oscar and simple themes.
2021-03-16 09:07:08 +01:00
Dr. Rolf Jansen
2a6dbeb6a5
Merge branch 'master' into conditional-sigusr1 2021-03-15 19:31:44 -03:00
Adam Tauber
4c631ac6d0 [fix] remove debug code 2021-03-15 21:47:27 +01:00
Dr. Rolf Jansen
4a27dabcf7
Merge branch 'master' into conditional-sigusr1 2021-03-15 17:03:36 -03:00
Noémi Ványi
8158d8654a fix Microsoft Academic engine 2021-03-15 20:21:28 +01:00
Adam Tauber
f97b4ff7b6 [fix] update youtube_noapi paging 2021-03-15 17:22:31 +01:00
Adam Tauber
dd34ac396c
Merge pull request #2652 from kvch/solr-engine
Add Apache Solr engine
2021-03-15 15:39:39 +01:00
Alexandre Flament
1664258061
Merge pull request #2655 from return42/fix-imports
[fix] remove unused import from yahoo-news engine
2021-03-15 08:38:34 +01:00
Alexandre Flament
5b176b3496
Merge pull request #2659 from MarcAbonce/onions-http-fix
Fix HTTP error in onion engines
2021-03-15 08:33:38 +01:00
Marc Abonce Seguin
f4a0a4d756 fix HTTP error in onion engines
regression from https://github.com/searx/searx/pull/2641
most onion websites only serve HTTP, so it must be enabled
2021-03-14 20:23:07 -07:00
Rolf
80025c3244 Windows does not support SIGUSR1, so don't use it unconditionally. 2021-03-14 19:04:36 -03:00
Markus Heiser
6e1f1085ef [fix] remove unused import from yahoo-news engine
Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-14 15:13:57 +01:00
Markus Heiser
3703ebb22a [drop] Acgsou engine - www.acgsou.com no longer exists
- https://www.acgsou.com/ acgsou.com is redirected to 36dm.club
- @rinpatch do not plan on maintaining the engine [1]

[1] https://github.com/searx/searx/pull/1283#issuecomment-798783585

Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-14 11:49:18 +01:00
Noémi Ványi
ff527e2681 Add Solr engine 2021-03-13 21:18:09 +01:00
Alexandre Flament
9292571304
Merge pull request #2346 from dalf/upgrade-oscar
[mod] oscar: upgrade dependencies
2021-03-13 09:29:13 +01:00
Alexandre Flament
92dd5e245e
Merge pull request #2626 from mikeri/solidtorrents
Add Solid Torrents engine
2021-03-12 19:45:22 +01:00
Alexandre Flament
a1a492baed
Merge pull request #2641 from dalf/disable_http_by_default
[mod] by default allow only HTTPS, not HTTP
2021-03-12 19:21:46 +01:00
Alexandre Flament
cb04d42806 [mod] oscar: update README.rst 2021-03-11 09:33:04 +01:00
Alexandre Flament
86912e2272 [mod] oscar: get bootstrap and typeahead from NPM 2021-03-11 09:33:01 +01:00
Alexandre Flament
44407353ef [mod] oscar: get leaflet and jquery from NPM
easy to upgrade (package.json)
2021-03-11 09:32:22 +01:00
Alexandre Flament
c7133efb12 [mod] oscar: move compiled files to the src directory 2021-03-10 19:28:51 +01:00
Alexandre Flament
eda3b513ac [mod] oscar: remove polyfills for Internet Explorer 2021-03-10 19:01:16 +01:00
Alexandre Flament
1268910274 [mod] oscar: remove unused images 2021-03-10 19:01:16 +01:00
Alexandre Flament
bdb41bea7b [mod] theme: remove require-2.1.15.min.js
See https://github.com/requirejs/requirejs/issues/1816

requirejs loads one file: leaflet.

This commit:
* removes requirejs
* load leaflet using <script src...> HTML tag in searx/templates/oscar/base.html
2021-03-10 19:01:15 +01:00
Alexandre Flament
2f3d5ec2af [mod] oscar: upgrade npm dependencies 2021-03-10 19:01:14 +01:00
Markus Heiser
96422e5c9f [fix] APKMirror engine - update xpath selectors and fix img_src
BTW: make the code slightly more readable

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-03-09 08:34:57 +01:00
Markus Heiser
d2faea423a [fix] rewrite Yahoo-News engine
Many things have been changed since last review of this engine.  This patch fix
xpath selectors, implements suggestion and is a complete review / rewrite of the
engine.

Signed-off-by: Markus Heiser <markus@darmarit.de>
2021-03-08 11:43:34 +01:00
Alexandre Flament
99e0651cea [mod] by default allow only HTTPS, not HTTP
Related to https://github.com/searx/searx/pull/2373
2021-03-08 11:35:08 +01:00
Michael Ilsaas
5549d58de3 Add Solid Torrents engine 2021-03-07 18:14:30 +01:00
Adam Tauber
44f4a9d49a [enh] add ability to send engine data to subsequent requests 2021-03-06 12:12:35 +01:00
Alexandre Flament
87f4cc4a9e
Merge pull request #2631 from searx/update_data_update_languages.py
Update searx.data - update_languages.py
2021-03-06 10:03:00 +01:00
Markus Heiser
4845183128 [mod] don't dump traceback of SearxEngineResponseException on init
When initing engines a "SearxEngineResponseException" is logged very verbose,
including full traceback information:

    ERROR:searx.engines:yggtorrent engine: Fail to initialize
    Traceback (most recent call last):
      File "share/searx/searx/engines/__init__.py", line 293, in engine_init
        init_fn(get_engine_from_settings(engine_name))
      File "share/searx/searx/engines/yggtorrent.py", line 42, in init
        resp = http_get(url, allow_redirects=False)
      File "share/searx/searx/poolrequests.py", line 197, in get
        return request('get', url, **kwargs)
      File "share/searx/searx/poolrequests.py", line 190, in request
        raise_for_httperror(response)
      File "share/searx/searx/raise_for_httperror.py", line 60, in raise_for_httperror
        raise_for_captcha(resp)
      File "share/searx/searx/raise_for_httperror.py", line 43, in raise_for_captcha
        raise_for_cloudflare_captcha(resp)
      File "share/searx/searx/raise_for_httperror.py", line 30, in raise_for_cloudflare_captcha
        raise SearxEngineCaptchaException(message='Cloudflare CAPTCHA', suspended_time=3600 * 24 * 15)
    searx.exceptions.SearxEngineCaptchaException: Cloudflare CAPTCHA, suspended_time=1296000

For SearxEngineResponseException this is not needed.  Those types of exceptions
can be a normal use case.  E.g. for CAPTCHA errors like shown in the example
above. It should be enough to log a warning for such issues:

    WARNING:searx.engines:yggtorrent engine: Fail to initialize // Cloudflare CAPTCHA, suspended_time=1296000

closes: #2612

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-03-05 17:26:22 +01:00
Alexandre Flament
0165e14a7f
Merge pull request #2632 from searx/update_data_update_wikidata_units.py
Update searx.data - update_wikidata_units.py
2021-03-05 11:59:44 +01:00
Alexandre Flament
152f6fc1da
Merge pull request #2630 from searx/update_data_update_ahmia_blacklist.py
Update searx.data - update_ahmia_blacklist.py
2021-03-05 11:59:20 +01:00
dalf
1e8b846954 Update searx.data - update_currencies.py 2021-03-05 10:56:57 +00:00
dalf
2f8a708481 Update searx.data - update_wikidata_units.py 2021-03-05 10:56:49 +00:00
dalf
d9dc3376d0 Update searx.data - update_languages.py 2021-03-05 10:56:46 +00:00
dalf
2857473553 Update searx.data - update_ahmia_blacklist.py 2021-03-05 10:56:33 +00:00
Alexandre Flament
aac37f288f
Merge pull request #2593 from dalf/update-autocomplete
Update autocomplete
2021-03-04 10:51:09 +01:00
Alexandre Flament
63f17d2e4c [enh] autocomplete refactoring, autocomplete on external bangs 2021-03-01 19:12:32 +01:00
Markus Heiser
d48e2e7b0b [enh] google scholar - python implementation of the engine
The old xpath configuration for google scholar did not work and is replaced by a
python implementation.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-03-01 15:16:37 +01:00
Alexandre Flament
4fa1290c11 [fix] answers: don't crash when the query is an empty string 2021-03-01 10:52:39 +01:00
Alexandre Flament
e2fb500892
Merge pull request #2608 from return42/unittest2
[py2to3] use unittest from py3, remove unittest2 from py2
2021-03-01 10:05:38 +01:00
Alexandre Flament
0c663e25fc
Merge pull request #2604 from searx/update_data_firefox_version
Update searx.data - firefox_version
2021-03-01 10:03:39 +01:00
Alexandre Flament
f77983e174
Merge pull request #2602 from MarcAbonce/fix-bing-fetch-languages
Fix fetch_languages for Bing
2021-03-01 09:06:37 +01:00
GazoilKerozen
5f6ac3afa2
Add Freesound engine (#2596)
Add freesound engine with player.

Co-authored-by: Gazoil <maildeguzel@gmail.com>
2021-03-01 08:52:36 +01:00
Markus Heiser
3bae35940a [py2to3] use unittest from py3, remove unittest2 from py2
- unittest2 is a backport of the new features added to the unittest testing
  framework in Python 2.7

- unittest2 was only needed in py2 and can be dropped now

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-28 11:37:06 +01:00
Alexandre Flament
b05f4d0664
Merge pull request #2605 from searx/update_data_currencies
Update searx.data - currencies
2021-02-26 12:56:32 +01:00
Alexandre Flament
aec5188b51
Merge pull request #2606 from searx/update_data_wikidata_units
Update searx.data - wikidata_units
2021-02-26 12:55:51 +01:00
dalf
893b6e4901 Update searx.data - ahmia_blacklist 2021-02-26 08:31:15 +00:00
dalf
7b9005df31 Update searx.data - wikidata_units 2021-02-26 08:31:01 +00:00
dalf
4c8ae5b7ed Update searx.data - firefox_version 2021-02-26 08:30:45 +00:00
dalf
d2778b5efe Update searx.data - currencies 2021-02-26 08:30:45 +00:00
Marc Abonce Seguin
d6681fd33b remove articles number from engines_languages.json 2021-02-25 23:54:21 -07:00
Marc Abonce Seguin
9b6ffed061 fix fetch_languages for bing
Bing has a list of regions that it supports and some of these regions
may have more than one possible language.

In some cases, like Switzerland, these languages are always shown as
options, so there is no issue. But in other cases, like Andorra, Bing
will only show one language at the time, either the region's default or
the request's language if the latter is supported by that region.

For example, if the HTTP request is in French, Andorra will appear as
fr-AD but if the same page is requested in any other language Andorra
will appear as ca-AD.

This is specially a problem when Bing assumes that the request is in
English because it overrides enough language codes to make several major
languages like Arabic dissappear from the languages.py file.

To avoid that issue, I set the Accept-Language header to a language
that's only supported in one region to hopefully avoid these overrides.
2021-02-25 23:51:49 -07:00
Alexandre Flament
7c1847d5f2 [mod] add utils/fetch_external_bangs.py
Based on duckduckgo bangs
Store bangs on a trie to allow autocomplete (not in this commit)
2021-02-24 18:48:36 +01:00
Alexandre Flament
5f4a085fc4
Merge pull request #2595 from dalf/update-wikidata-units
[mod] update wikidata_units.json and fetch_wikidata_units.py
2021-02-23 17:22:37 +01:00
Alexandre Flament
46ca32c3cc [mod] update currencies.json and fetch_currencies.py
use a sparql request on wikidata to get the list of currencies.

currencies.json contains the translation for all supported searx languages.

Supersede #993
2021-02-23 16:42:28 +01:00
Alexandre Flament
93d1da4906 [mod] update wikidata_units.json and fetch_wikidata_units.py
The fetch_wikidata_units.py result won't change randomly.
See comments in the script.
2021-02-23 13:10:38 +01:00
Noémi Ványi
1be6ab2a91 Fix paging of Bing Images 2021-02-22 21:19:34 +01:00
datagram1
1d0a32a2c5 Added rumble.com video search engine. TODO video embedding.
Update rumble.py

some lines too long.

Disable Rumble engine

disabled : True

PEP8 fix

change line spacing
2021-02-20 12:48:56 +00:00
Alexandre Flament
44a6593c13
Merge pull request #2573 from unixfox/yggtorrent
update yggtorrent url + add it back
2021-02-16 08:22:07 +01:00
Emilien Devos
4b37e10dd9 fix yggtorrent url + add it back 2021-02-15 13:38:34 +01:00
Thorben Günther
fbbd4cc21f
Improve peertube searching
At the moment videos without a description are not shown - setting
default content to "" fixes this.
Another current bug is that thumbnails are not displayed. This is caused
by a double slash in the url. For this every trailing slash is now
stripped (for backwards compatibility) and the API response is correctly
parsed.
2021-02-13 19:47:33 +01:00
Alexandre Flament
45027765e3
Merge pull request #2566 from dalf/remove-yandex
[remove] yandex engine
2021-02-12 17:12:07 +01:00
Alexandre Flament
c22d4c764c [fix] duckduckgo engine: "!ddg !g" do not redirect to google
* searx understand "!ddg !g time" as : send "!g time" to DDG
* !g a DDG bang for Google: DDG return a HTTP redirect to Google

This commit adds a the allows_redirect param not to follow HTTP redirect.

The DDG engine returns a empty result as before without HTTP redirect.
2021-02-12 11:10:08 +01:00
Alexandre Flament
d76660463b
Merge pull request #2562 from dalf/mod-json-engine
[mod] json_engine: add content_html_to_text and title_html_to_text
2021-02-12 10:58:28 +01:00
Alexandre Flament
7dcf67a47a
Merge pull request #2565 from dalf/upd-wikipedia
[upd] wikipedia engine: return an empty result on query with illegal characters
2021-02-12 10:57:05 +01:00
Alexandre Flament
2b60d0d243
Merge pull request #2564 from dalf/fix-seznam
[fix] fix seznam engine
2021-02-12 10:56:53 +01:00
Alexandre Flament
7e83818879
Merge pull request #2560 from dalf/fix-duckduckgo
Fix duckduckgo
2021-02-12 10:56:40 +01:00
Alexandre Flament
63d6ccfbc2
Merge pull request #2557 from dalf/fix-raise_for_httperror
Fix: activate raise_for_error by default
2021-02-12 10:56:25 +01:00
Alexandre Flament
74c8b5606f
Merge pull request #2541 from return42/mediathekviewweb
[enh] add engine MediathekViewWeb (API)
2021-02-11 15:11:26 +01:00
Alexandre Flament
5d9db6c2f7 [remove] yandex engine 2021-02-11 14:28:06 +01:00
Alexandre Flament
35dd069402 [fix] fix seznam engine
no paging support
2021-02-11 12:53:19 +01:00
Alexandre Flament
7d6e69e2f9 [upd] wikipedia engine: return an empty result on query with illegal characters
on some queries (like an IT error message), wikipedia returns an HTTP error 400.
this commit returns an empty result instead of showing an error to the user.
2021-02-11 12:29:21 +01:00
Alexandre Flament
ff84a1af35 [mod] json_engine: add content_html_to_text and title_html_to_text
Some JSON API returns HTML in either in the HTML or the content.
This commit adds two new parameters to the json_engine:
content_html_to_text and title_html_to_text, False by default.

If True, then the searx.utils.html_to_text removes the HTML tags.

Update crossref, openairedatasets and openairepublications engines
2021-02-10 16:42:11 +01:00
Alexandre Flament
436d366448
Merge pull request #2544 from mrwormo/congresslibrary
[Engine] Add Library of Congress engine
2021-02-10 10:13:46 +01:00
Alexandre Flament
eafd27f42a
Merge pull request #2556 from dalf/fix-apk-mirror
[fix] fix apk_mirror engine
2021-02-10 10:12:37 +01:00
Alexandre Flament
d2dac11392 [mod] duckduckgo engine: better support of the language preference
After the main request, send a second to https://duckduckgo.com/t/sl_h

See https://github.com/searx/searx/issues/2259
2021-02-09 14:36:43 +01:00
Alexandre Flament
74d56f6cfb [mod] poolrequests: for one (user request, engine) always use the same HTTPAdapter
The duckduckgo engine requires an additional request after the results have been sent.
This commit makes sure that the second request uses the same HTTPAdapter
= the same IP address, and the same proxy.
2021-02-09 14:33:36 +01:00
Markus Heiser
bc1be3f0e9 [enh] add engine MediathekViewWeb (API)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-09 13:08:01 +01:00
mrwormo
051da88328 Add Library of Congress engine 2021-02-09 12:45:39 +01:00
Alexandre Flament
9211cdfe9b [upd] remove google_play_music engine
Google Play Music has been replaced by Youtube music.
2021-02-09 11:38:50 +01:00
Alexandre Flament
aedf03c0f7 Fix: activate raise_for_error by default
Fix commit d703119d3a :
Some engines need to parse the HTTP error but
raise_for_error is always set to False in the "request" function.
2021-02-09 11:27:41 +01:00
Alexandre Flament
5e055b069b [fix) fix apk_mirror engine 2021-02-09 11:02:12 +01:00
Alexandre Flament
e4cc7f13a3
Merge pull request #2542 from kvch/fix-naver-engine
Fix XPATHs in Naver engine
2021-02-09 08:52:38 +01:00
Alexandre Flament
bec9e30fe7
Merge pull request #2554 from MarcAbonce/zh-variants-in-wikipedia
Add support for Chinese variants in Wikipedia
2021-02-09 08:49:59 +01:00
Daniel Hones
138f32471c Updated webutils.highlight_content to ignore double-quotes when highlighting query parts 2021-02-08 23:58:54 -05:00
Marc Abonce Seguin
64e81794fe add support for Chinese variants in Wikipedia 2021-02-08 21:56:45 -07:00
Noémi Ványi
ac309f5b8d Fix naver engine
Closes #2540
2021-02-07 18:58:13 +01:00
Markus Heiser
41c03cf011 [drop] metager - xpath engine won't work anymore
The new version of MetaGer needs to reload the reults (into a iframe) with a
unique tag (see HTML response below).

Implementing a dedicated metager-engine for searx makes no sense to me. The
great days of MetaGer seems to be ended.  I remember the good old days this
project started in the 90's of the last century.  But in the last few years it
becomes more and more crap.  As the name suggested, MetaGer was made for
germans in the first place.  They have added a english and spain translation but
the i18n is very poor compared to what searx offers.

It's a pity, lets drop MetaGer.

This is the first response, the id (b82679980656899ba5a17ffd02a56846) is unique
for each query:

    $ curl "https://metager.org/meta/meta.ger3?eingabe=foo&submit-query=&focus=web"
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <link rel="stylesheet" href="/index.css?id=b82679980656899ba5a17ffd02a56846">
        <script src="/index.js?id=b82679980656899ba5a17ffd02a56846"></script>
    <title>foo - MetaGer</title>
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
    </head>
    <body>
        <iframe id="mg-framed" src="https://metager.org/meta/meta.ger3?eingabe=foo&amp;submit-query=&amp;focus=web&amp;mgv=b82679980656899ba5a17ffd02a56846" autofocus="true" onload="this.contentWindow.focus();"></iframe>
     </body>

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-02-07 14:55:21 +01:00
Hermógenes Oliveira
514faa9162 [feat] recoll: paged json support 2021-02-07 10:05:35 -03:00
Marc Abonce Seguin
c937a9e85f [fix] get correct locale with country from browser
Some of our interface locales include uppercase country codes,
which are separated by `_` instead of the more common `-`.
Also, a browser's `Accept-Language` header could be in lowercase.

This commit attempts to normalize those cases so a browser's
language+country codes can better match with our locales.

This solution assumes that our UI locales have nothing more than
language and optionally country. If we ever add a script specific
locale like `zh-Hant-TW` this would have to change to accomodate
that, but the idea would be pretty much the same as this fix.
2021-02-04 19:53:59 -07:00
mrwormo
c4c1636b18 Add Creative Commons search engine 2021-02-04 11:31:35 +01:00
Alexandre Flament
ca93a01844 [mod] dynamically set language_support variable
The language_support variable is set to True by default,
and set to False in only 5 engines.

Except the documentation and the /config URL, this variable is not used.

This commit remove the variable definition in the engines, and
set value according to supported_languages length: False when the length is 0,
True otherwise.

Close #2485
2021-02-01 17:10:37 +01:00
Markus Heiser
7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing non valid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser
e436287385 [mod] checker: add some additional tests
BTW: fix indentation by 2 spaces

The additional tests has been commented out in the google engines to not release
any CAPTCHA issues.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
Markus Heiser
b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Markus Heiser
ff6804e545 [data] make engines.languages
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:52:32 +01:00
Markus Heiser
8cdad5d85d [fix] google-videos: parse values for 'length' & 'author'
The 'video.html' template from the 'oscar' design supports replacement
for *author* and *length*.  Google-videos does not have an author, alternatively
the publisher info from is used for the *author*.

Hint: these replacements are not supported by the 'simple' design.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:51:24 +01:00
Markus Heiser
89b3050b5c [fix] revise of the google-Video engine
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-24 09:39:30 +01:00
Alexandre Flament
8c46b767d0 [fix] google_news: avoid one HTTP redirect except for the English results
also add
params['soft_max_redirects'] = 1
to avoid false error reporting in /stats/errors
2021-01-24 08:53:35 +01:00
Markus Heiser
5f92dfcdbe [fix] google-news: query uses locale without country tag
Wthout country-region tag google will redirect to correct the contry tag [1]:

    SEARX_DEBUG=1 searx-checker -v "google news"
    ...
    https://news.google.com:443 "GET /search?q=computer&hl=en...      HTTP/1.1" 302 0
    https://news.google.com:443 "GET /search?q=computer&hl=en-US&.... HTTP/1.1" 200 None
    ...

[1] https://github.com/searx/searx/pull/2483#issuecomment-765600849

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-23 11:37:14 +01:00
Markus Heiser
baec54c492 [fix] revise of the google-news engine
This revise is based on the methods developed in the revise of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 18:49:45 +01:00
Alexandre Flament
73c86f9bf2 [mod] checker: disable by default 2021-01-19 21:44:48 +01:00
Alexandre Flament
3b7b852aa8 [fix] checker: minor fix about language detection 2021-01-19 21:29:31 +01:00
Alexandre Flament
aa887eb375 [mod] checker : replace pycld3 by langdetect
pycld3 requires the native library cld3
langdetect is a pure python package
2021-01-19 21:26:04 +01:00
Alexandre Flament
67a1aab0d5 [fix] /stats/checker : remove the timestamp field when the checker is disabled 2021-01-18 08:19:53 +01:00
Alexandre Flament
d473407ec9 [fix] checker: fix engine statistics
Without this commit, the URL /stats/errors shows percentage above 100% after the checker has run.
2021-01-18 08:19:44 +01:00
Alexandre Flament
ca76f3119a [fix] error_recorder: record code and lineno about the engine
since the PR #2225 , code and lineno were sometimes meaningless
see /stats/errors
2021-01-17 16:25:11 +01:00
Alexandre Flament
80d7411f2c
Merge pull request #2452 from kvch/add-wilby-engine
Add wiby.me engine
2021-01-16 22:36:31 +01:00
Alexandre Flament
b405646749
Merge pull request #2451 from mrwormo/invidious-engine
[Fix] Invidious Engine
2021-01-16 19:25:45 +01:00
Alexandre Flament
a4dcfa025c [enh] engines: add about variable
move meta information from comment to the about variable
so the preferences, the documentation can show these information
2021-01-14 20:57:17 +01:00
mrwormo
2dff3887f0 [fix] Invidious engine by enabling requests by randomly picking amongst working instances 2021-01-14 12:12:56 +01:00
Alexandre Flament
912c7e975c [fix] checker: don't run the checker when uwsgi is not properly configured
Before this commit, even with the scheduler disabled, the checker was running
at least once for each uwsgi worker.
2021-01-13 14:07:39 +01:00
Alexandre Flament
7f0c508598 [fix] checker: fix typo unknown instead of unknow 2021-01-12 11:47:17 +01:00
Alexandre Flament
a0c8b413a6 [mod] searx.shared: minor tweaks
searx.shared.shared_abstract.SharedDict inherit from abc.ABC
searx.shared.shared_uwsgi.schedule can schedule multiple functions without issue
2021-01-12 11:47:17 +01:00
Alexandre Flament
87bafbc32b [mod] checker: add status and timestamp to the result
for each engine: replace status by success
2021-01-12 11:47:17 +01:00
Alexandre Flament
f3e1bd308f [mod] checker: minor adjustements on the default tests
the query "time" is convinient because most of the search engine will return some results,
but some engines in the general category will return documentation about the HTML tags <time> or <input type="time">
2021-01-12 11:47:17 +01:00
Alexandre Flament
45bfab77d0 |mod] checker: improve searx-checker command line
* output is unbuffered
* verbose mode describe more precisly the errrors
2021-01-12 11:47:17 +01:00
Alexandre Flament
3a9f513521 [enh] checker: background check
See settings.yml for the options
SIGUSR1 signal starts the checker.
The result is available at /stats/checker
2021-01-12 11:47:17 +01:00
Alexandre Flament
6e2872f436 [enh] add searx.shared
shared dictionary between the workers (UWSGI or werkzeug)
scheduler: run a task once every x seconds (UWSGI or werkzeug)
2021-01-12 11:47:17 +01:00
Markus Heiser
9c581466e1 [fix] do not colorize output on dumb terminals
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-12 11:47:17 +01:00
Alexandre Flament
ca0889d488 [enh] checker: wikidata & ddd: add specific tests 2021-01-12 11:47:17 +01:00
Alexandre Flament
16a889dd8f [enh] checker: add rosebud test 2021-01-12 11:47:17 +01:00
Alexandre Flament
8cbc9f2d58 [enh] add checker 2021-01-12 11:47:17 +01:00
Alexandre Flament
f7e11fd722
Merge pull request #2459 from dalf/update-python
Update python
2021-01-12 11:02:58 +01:00
Alexandre Flament
9c55d772e9
Merge pull request #2408 from return42/rm-brand-make
[mod] move brand options from Makefile to settings.yml
2021-01-12 10:52:42 +01:00
Alexandre Flament
f5c3cb7afa [mod] drop Python 3.5 support 2021-01-12 09:45:16 +01:00
Alexandre Flament
8d0312d014
Merge pull request #2458 from MarcAbonce/hide-links-mobile2
Hide links panel in mobile screens
2021-01-12 08:27:24 +01:00
Marc Abonce Seguin
635c6516a4 hide links panel in mobile screens 2021-01-11 20:40:21 -07:00
Alexandre Flament
424e6abc7e [mod] settings.yml: move brand settings to a dedicated section 2021-01-11 22:59:52 +01:00
Markus Heiser
d0338cb504 [fix] add missing brand.CONTACT_URL to /config API endpoint
Suggested-by: @dalf / https://github.com/searx/searx-stats2/issues/59#issuecomment-747961582
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Markus Heiser
9e53470b4c [mod] get rid of searx/brand.py
Removes module searx/brand.py and creates a namespace at searx.brand.

This patch is a first 'proof of concept'.  Later we can decide to remove the
brand namespace entirely or not.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Markus Heiser
9485179064 [mod] move brand options from Makefile to settings.yml
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-11 22:12:38 +01:00
Alexandre Flament
c2646df496
Merge pull request #2454 from MarcAbonce/fix-empty-lang-bang
Fix empty colon in query from selecting Chinese
2021-01-10 11:01:32 +01:00
Marc Abonce Seguin
571ce9ff07 fix empty colon in query from selecting Chinese 2021-01-09 22:11:41 -07:00
Noémi Ványi
a6dd1de4a8 Add wiby.me engine
Closes #2339
2021-01-08 23:11:18 +01:00
Markus Heiser
b0bb0a3a0f [fix] Library Genesis links shifted by 1 #1998
Fixes: #1998
Suggested-by: @linuxmue
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-07 14:47:34 +01:00
Émilien Devos
fc6cfc3b58
Remove voat due to its shutdown
Voat shutted down on December 25th, 2020 at 12 noon PST: https://voat.co/host/voat/static/inactive.min.html?ReturnUrl=/
2021-01-06 10:45:02 +00:00
Alexandre Flament
54e69d0367 [upd] update dependencies
minor change in the oscar theme becase the last version of jinja2
respect more carefully the spaces in the templates
2020-12-28 09:04:39 +01:00
Alexandre Flament
568b9465e9 [mod] check secret_key when searx.webapp is imported
Without this commit the module searx checks the secret_key value.

With this commit, make docs, utils/standalone_searx.py,
utils/fetch_firefox_version.py works without SEARX_DEBUG=1

For reference see https://github.com/searx/searx/pull/2386
2020-12-27 10:30:20 +01:00
Alexandre Flament
1956ab4b50
Merge pull request #2412 from dalf/update-buildenv
[fix] update buildenv
2020-12-27 08:31:23 +01:00
Markus Heiser
4de276e364 [upd] make SEARX_DEBUG=1 useragents.update
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-22 14:23:58 +01:00
Alexandre Flament
db5b060455 [fix] update buildenv
CONTACT_URL is unset in Makefile, but searx/brand.py and
utils/brand.env are not updated.

This commit fixes this issue.
2020-12-21 10:55:28 +01:00
Alexandre Flament
3f8ebf70b1 [fix] pylint: use "raise ... from ..." 2020-12-20 09:46:53 +01:00
Alexandre Flament
eb33ae6893 [fix] Python 3.9: use html.unescape instead of HTMLParser.unescape 2020-12-20 09:46:53 +01:00
Alexandre Flament
04447f8c1a
Merge pull request #2398 from dalf/mod-search-query
Mod search query
2020-12-20 09:32:54 +01:00
Alexandre Flament
f4983e7415 [mod] remove emojis from source code 2020-12-20 08:58:57 +01:00
Alexandre Flament
eda8934f15 [mod] searx.search.EngineRef: remove from_bang parameter
from_bang is True when the user query contains a bang.
In this case the category is also set to 'none'.

from_bang only usage was in searx.webadapter.parse_specific :
if from_bang is True, then the EngineRef category is ignored and force to 'none'.

This commit also removes the searx.webadapter.parse_sepecific function.
2020-12-18 12:29:48 +01:00
Alexandre Flament
995ba2f406 [mod] searx.search.SearchQuery: remove categories parameter
The categories parameter is useless in the constructor:
it is always the categories from the EngineRef.

The categories becomes a property.
2020-12-18 12:29:48 +01:00
Alexandre Flament
14c7cc0e11 [mod] Makefile: make CONTACT_URL optional 2020-12-18 09:54:03 +01:00
BBaoVanC
19fce74443
Add link to contact instance maintainer to footer of each page (#2391) 2020-12-18 09:53:28 +01:00
Alexandre Flament
5c6a5407a0 [fix] fix of PR #2225 2020-12-17 16:49:48 +01:00
Alexandre Flament
02fc4147ce [mod] dictzone, translated, currency_convert: use engine_type online_curency and online_dictionnary 2020-12-17 11:39:36 +01:00
Alexandre Flament
7ec8bc3ea7 [mod] split searx.search into different processors
see searx.search.processors.abstract.EngineProcessor

First the method searx call the get_params method.

If the return value is not None, then the searx call the method search.
2020-12-17 11:39:36 +01:00
Alexandre Flament
c0cc01e936 [mod] searx.search: search_multiple_requests is a method of Search class 2020-12-17 11:39:36 +01:00
Alexandre Flament
3b87efb3db [mod] move seax/search.py to searx/search/__init__.py 2020-12-17 11:39:36 +01:00
Alexandre Flament
9bc1856e2b [mod] themes: remove legacy, courgette and pix-art themes 2020-12-17 11:33:28 +01:00
Alexandre Flament
88660fde90
Merge pull request #2396 from lucky13820/patch-1
Fix the StartPage result title is showing the url
2020-12-17 08:23:34 +01:00
lucky13820
fea8958e99
Fix the StartPage result title is showing the url
Fix the issue 2395 where StartPage result title is showing the url. https://github.com/searx/searx/issues/2395
2020-12-16 13:54:14 -08:00
Markus Heiser
9db7d6357b [themes] add hyperlink to searx instances list in error message
closes: #2383

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-16 20:24:42 +01:00
Alexandre Flament
39ac81478c prepare release 0.18.0 2020-12-14 19:03:09 +01:00
Alexandre Flament
292b73a3fc
Merge pull request #2385 from joshu9h/patch-1
[Fix] Startpage
2020-12-14 17:56:48 +01:00
Alexandre Flament
36600118fb
Merge pull request #2372 from dalf/remove-broken-engines
[remove] remove searchcode_doc and twitter
2020-12-13 17:11:05 +01:00
joshu9h
8260435c8b
[Fix] Startpage 2020-12-13 15:43:50 +01:00
Alexandre Flament
3c4a9c1188
Merge pull request #2358 from dalf/fix-command
[fix] command engine: SearchQuery.query is str not bytes
2020-12-11 14:53:24 +01:00
Alexandre Flament
d703119d3a [enh] add raise_for_httperror
check HTTP response:
* detect some comme CAPTCHA challenge (no solving). In this case the engine is suspended for long a time.
* otherwise raise HTTPError as before

the check is done in poolrequests.py (was before in search.py).

update qwant, wikipedia, wikidata to use raise_for_httperror instead of raise_for_status
2020-12-11 14:37:08 +01:00
Alexandre Flament
033f39bff7
Merge pull request #2376 from dalf/fix-mojeek
Fix mojeek
2020-12-11 13:14:54 +01:00
Alexandre Flament
6bc6d5e9fd
Merge pull request #2371 from dalf/mod-genius
[mod) genious: return valid results even if contents are empty
2020-12-11 13:14:03 +01:00
Alexandre Flament
0ba74cd812 [mod] results: don't crash when an engine don't have a category
According to
820b468bfe/searx/engines/__init__.py (L87-L88)

an engine can have no category at all.

Without this commit, searx raise an exception in searx/results.py

Note: in this case, the engine is not shown in the preferences.
2020-12-10 10:57:07 +01:00
Alexandre Flament
d41cafd5f3 [fix] xpath, mojeek: fix commit 58d72f2692
before commit 58d72f2, category was not set in xpath.py,
so searx/engines/__init__py was setting the category to ['general']

the commit 58d72f2 set the category to [] which is not replaced by searx/engines/__init__.py
consequence: the mojeek engine is hidden in the preferences.

this commit revert the xpath.py change.

close #2368
2020-12-10 10:52:06 +01:00
Noémi Ványi
3a63dfbdd7 display if an engine does not support https
Closes #302
2020-12-09 20:49:54 +01:00
Alexandre Flament
1c9e7cef50 [remove] remove searchcode_doc and twitter
* twitter: the API has changed. the engine needs to rewritten.
* searchcode_doc: the API about documentation doesn't exist anymore.
2020-12-09 13:14:31 +01:00
Alexandre Flament
fa73f10f11 [mod) genious: return valid results even if contents are empty 2020-12-09 13:01:34 +01:00
Alexandre Flament
42a194898b
Merge pull request #2360 from dalf/update-libgen
[mod] libgen: update the URL to http://libgen.rs/
2020-12-08 20:33:53 +01:00
Alexandre Flament
a77d8c8227
Merge pull request #2359 from dalf/update-duden
[mod] duden engine
2020-12-08 20:33:38 +01:00
Alexandre Flament
bd4869ecd0
Merge pull request #2366 from dalf/remove-seedpeer
[remove] seedpeer engine
2020-12-08 20:33:23 +01:00
Alexandre Flament
56c64d6b64 [remove] seedpeer engine
the website is offline.
2020-12-07 21:02:29 +01:00
Alexandre Flament
c1a9732268
Merge pull request #2364 from dalf/fix-youtube-noapi
[fix] youtube_noapi engine
2020-12-07 20:26:00 +01:00
Alexandre Flament
13d3004703
Merge pull request #2365 from dalf/fix-soundcloud
[fix] soundclound: accept result without content
2020-12-07 20:25:17 +01:00
Alexandre Flament
62073c0e1d
Merge pull request #2361 from dalf/fix-1x
[fix] 1x engine
2020-12-07 20:24:47 +01:00
Alexandre Flament
923bc02c17
Merge pull request #2363 from dalf/fix-wikipedia-minor
[fix] wikipedia: minor fix: return no result instead of crash in some very few cases.
2020-12-07 18:33:37 +01:00