Commit Graph

1297 Commits

Author SHA1 Message Date
Martin Fischer
bb06758a7b [refactor] add type hints & remove Setting._post_init
Previously the Setting classes used a horrible _post_init
hack that prevented proper type checking.
2022-01-06 14:21:14 +01:00
Alexandre Flament
aedd6279b3
Merge pull request #634 from not-my-profile/powered-by
Introduce `categories_as_tabs` & group engines in tabs
2022-01-06 09:22:02 +01:00
Alexandre Flament
d3ecadd3f8
Merge pull request #679 from dalf/brand-searxng
searxng.org: update setup.py & settings.yml
2022-01-05 19:07:53 +01:00
Martin Fischer
d01e8aa8cc [mod] introduce searx.engines.Engine for type hinting 2022-01-05 11:03:44 +01:00
Martin Fischer
1e195f5b95 [mod] move group_engines_in_tab to searx.webutils 2022-01-05 11:03:44 +01:00
Martin Fischer
5d74bf3820 [enh] move dictionaries, Erowid & IMDb out of general category
The general category is the category that is searched by default.
From a privacy standpoint it doesn't make sense to send all general
queries to specialized search engines that cannot deal with those
queries anyway.
2022-01-05 11:03:44 +01:00
Martin Fischer
ab90e2ac49 [enh] show categories not in any tab category in "Other" preferences tab
Previously we didn't have a good place to put search engines that don't
fit into any of the tab categories. This commit automatically puts
search engines that don't belong to any tab category in an "other"
category, that is only displayed in the user preferences (and not above
search results).
2022-01-05 11:03:44 +01:00
Martin Fischer
b02f762687 [enh] add more categories 2022-01-05 11:00:11 +01:00
Martin Fischer
8e9ad1ccc2 [enh] introduce categories_as_tabs
Previously all categories were displayed as search engine tabs.
This commit changes that so that only the categories listed under
categories_as_tabs in settings.yml are displayed.

This lets us introduce more categories without cluttering up the UI.
Categories not displayed as tabs  can still be searched with !bangs.
2022-01-03 07:01:49 +01:00
Martin Fischer
df34b1ddcf [enh] settings.yml: allow granular overwrites for about 2022-01-03 07:01:49 +01:00
Alexandre Flament
d83aa2b0d2
Merge pull request #613 from return42/pylint-bing-images
[pylint] Bing (Images) engine
2022-01-02 22:00:55 +01:00
Alexandre Flament
76cbfbbdda reference docs.searxng.org 2022-01-02 21:18:29 +01:00
Markus Heiser
61ce0c2244 [fix] bing engines: fetch_supported_languages
The Request to and the Response from https://www.bing.com/account/general has
been changed.

[1] https://github.com/searxng/searxng/pull/672#discussion_r777104919

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-01 17:31:38 +01:00
Markus Heiser
dc4f1f705d [pylint] Bing (Images) engine
Fix remarks from pylint and remove obsolete try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-28 14:43:39 +01:00
Markus Heiser
6d7a38a912 [pylint] Bing (Videos) engine
Fix remarks from pylint and remove obsolete try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-28 14:33:05 +01:00
Markus Heiser
d84226bf63 [fix] issues reported by pylint
Fix pylint issues from commit (3d96a983)

    [format.python] initial formatting of the python code

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 10:16:20 +01:00
Markus Heiser
3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Markus Heiser
fcdc2c2cd2 [format.python] disable py code formatting for some hunks of code
Disable the python code formatting from python-black, where the readability of
code suffers by formatting.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:16:03 +01:00
Martin Fischer
e28c6bda35 [doc] introduce about.language and sort engines by it 2021-12-21 09:58:51 +01:00
Markus Heiser
7a215e07e7
Merge pull request #611 from return42/fix-bing
[fix] bing engine: fix paging support, show inital page.
2021-12-20 10:08:52 +01:00
Markus Heiser
2af50c2588 [pylint] Reddit engine
Add Reddit engine to pylint process

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 17:59:47 +01:00
Markus Heiser
6b85607274 [fix] bing engine: fix paging support, show inital page.
Follow up queries for the pages needed to be fixed.

- Split search-term in one for initial query and one for following queries.
- Set some headers in HTTP requests, bing needs for paging support.
- IMO //div[@class="sa_cc"] does no longer match in a bing response.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 13:50:38 +01:00
Markus Heiser
b2177e5916 [pylint] Bing (Web) engine
Fix remarks from pylint and improved code-style.  In preparation for a bug-fix
of the Bing (Web) engine I add this engine to the pylint-list.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-18 13:40:36 +01:00
Markus Heiser
f41734a543 [fix] engine bing-news: replace the http:// by https://
BTW: add bing_news to the pylint process

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-17 13:25:50 +01:00
Markus Heiser
8cc7c880ae
Merge pull request #587 from dalf/fix-gigablast
[fix] gigablast engine
2021-12-12 15:58:13 +01:00
Markus Heiser
b5c9cc4ff3
Merge pull request #586 from dalf/remove-yggtorrent
[del] remove yggtorrent
2021-12-07 07:00:47 +01:00
Alexandre Flament
1a6207574e [fix] gigablast engine
fetch extra params after 3000 seconds
2021-12-06 22:55:15 +01:00
Alexandre Flament
fbc2a6ab4b [del] remove yggtorrent
yggtorrent is behind cloudflare now
close #580
2021-12-06 21:59:51 +01:00
Alexandre Flament
037cb7dd3d [fix] imdb: don't crash when there is no result 2021-12-06 21:49:18 +01:00
Markus Heiser
6e06618e0c [fix] google-videos engine: ignore news articles
In the video search, google also sometimes includes news.  E.g. in the DE
language when you search for `!gov paris`, google adds an article from a german
newspaper (FAZ), I assume these are sponsored link (not tagged advertisement?)

Those links do not have an image / this patch ignores *video links* wqithout an
image ID.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26 17:11:20 +01:00
Markus Heiser
1ce09df9aa [fix] google video engine - rework of the HTML parser
The google video response has been changed slightly, a rework of the parser was
needed.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-26 01:14:17 +01:00
Markus Heiser
488ace1da9 [fix] google engine - suggestion
BTW: google no longer offers *spelling suggestions*

Closes: https://github.com/searxng/searxng/issues/442
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-25 19:42:03 +01:00
Markus Heiser
5b28c9109f [fix] google images: @href index 0 not found
Sometimes there is no href in the `<a ..>` tag of a *link_node* [1].

[1] https://github.com/searxng/searxng/issues/532

Reported-by: @TheEssem
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-21 09:55:59 +01:00
Markus Heiser
4c82ac7670 [drop] engine digg - https://digg.com/api is no longer available
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-11-19 15:00:22 +01:00
Tom
e1d60051ca
[fix] Qwant search query string
Search string: "!qwant time"
Resulting request URL: https://api.qwant.com/v3/search/web?q=q=time&count=10&offset=0&device=desktop&safesearch=1&locale=en_US
Notice the double "q="

Resulting request URL after fix: https://api.qwant.com/v3/search/web?q=time&count=10&offset=0&device=desktop&safesearch=1&locale=en_US
2021-11-17 18:13:54 +01:00
MrPaulBlack
41494d9f47 [fix] make reddit only in social media category avail.
fix https://github.com/searxng/searxng/issues/470
2021-11-01 20:37:17 +01:00
Alexandre Flament
64b29ad838 [mod] microsoft academic: increase timeout to 6 seconds
also avoid a crash when there is no result

close #433
2021-10-26 12:26:43 +02:00
Markus Heiser
713814547a [fix] yahoo engine - don't lump all search suggestions together
Closes: https://github.com/searxng/searxng/issues/421
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-21 07:51:05 +00:00
Markus Heiser
f63ffbb22b [fix] engine - yahoo: rewrite and fix issues
Languages are supported by mapping the language to a domain.  If domain is not
found in :py:obj:`lang2domain` URL ``<lang>.search.yahoo.com`` is used.

BTW: fix issue reported at https://github.com/searx/searx/issues/3020

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-16 20:05:26 +00:00
Markus Heiser
38a157b56f [pylint] engines: yahoo fix several issues reported from pylint
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-16 20:05:26 +00:00
MrPaulBlack
00b0394e19 [fix] language param for qwant 2021-10-14 16:11:44 +00:00
Noémi Ványi
4cc1ee8565 [fix] qwant engine - only get results from categories
Reported-by: https://github.com/searx/searx/issues/3014
Cherry-picked: https://github.com/searx/searx/commit/3bcca43
2021-10-12 18:42:50 +00:00
Paolo Basso
64df011e2f [mod] engines - add zlibrary engine 2021-10-11 14:58:44 +00:00
Markus Heiser
3abbe6d25b [fix] engine torznab - categories, before join convert int to str
BTW add init() function and replace SearxEngineAPIException by ValueError.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-07 15:27:55 +00:00
Markus Heiser
9fb77065bd [fix] engine torznab - marginal issues reported from linters
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-07 15:27:55 +00:00
Paolo Basso
d803df8d89 [mod] engines - add torznab WebAPI 2021-10-07 15:27:55 +00:00
Markus Heiser
19e41c137e [mod] set 'engine.supported_languages' from the origin python module
The key of the dictionary 'searx.data.ENGINES_LANGUAGES' is the *engine name*
configured in settings.xml.  When multiple engines are configured to use the
same origin engine (e.g. `engine: google`)::

    - name: google
      engine: google
      use_mobile_ui: false
      ...

    - name: google italian
      engine: google
      use_mobile_ui: false
      language: it
      ...

    - name: google mobile ui
      engine: google
      shortcut: gomui
      use_mobile_ui: true

There exists no entry for ENGINES_LANGUAGES[engine.name] (e.g. `name: google
mobile ui` or `name: google italian`).  This issue can be solved by recreate the
ENGINES_LANGUAGES::

    make data.languages

But this is nothing an SearXNG admin would like to do when just configuring
additional engines, since this just doubles entries in ENGINES_LANGUAGES and
BTW: `make data.languages` has various external requirements which might be not
installed or not available, on a production host.

With this patch, if engine.name fails, ENGINES_LANGUAGES[engine.engine] is used
to get the engine.supported_languages (e.g. `google` for the engine named
`google mobile`).

For an engine, when there is `language: ...` in the YAML settings, the engine
supports only one language, in this case engine.supported_languages should
contains this value defined in settings.yml (e.g. `it` for the engine named
`google italian`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Closes: https://github.com/searxng/searxng/issues/384
2021-10-07 08:45:02 +02:00
Alexandre Flament
8a897b86f1 [mod] engines - IMDB: add thumbnails 2021-10-05 09:10:02 +02:00
Paul Alcock
823d44ed0a [mod] engines - add IMDB / Internet Movie Database
Merged from @Guilvareux's commit [1] and slightly modfied / see [2].

[1] https://github.com/searx/searx/pull/2980/commits/f2f90071
[2] https://github.com/searx/searx/pull/2980
2021-10-03 11:44:25 +02:00
Markus Heiser
a5b7ed9550 [mod] engine duckduckgo - update supported_languages_url
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-01 20:01:41 +02:00
Markus Heiser
4c9b8b29ee [mod] engine duckduckgo - use DuckDuckGo-Lite
Implement a scrapper for DuckDuckGo-Lite [1].  The existing DuckDuckGo [2]
engine does not support paging.  DuckDuckgo-Lite is much faster, less verbose
and does have a paging option (reversed engineered from the input form of [1]).

[1] https://lite.duckduckgo.com/lite
[2] https://duckduckgo.com/

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-10-01 20:01:41 +02:00
Markus Heiser
ecb3912bd0 [fix] engine stackexchange - decode HTML entities in title & content
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-29 08:08:18 +02:00
Markus Heiser
b62851559b [mod] replace old stackoverflow engine by Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-28 19:12:37 +02:00
Markus Heiser
55fee1e45d [mod] engines - add Stack Exchange API v2.3
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-28 19:01:04 +02:00
Alexandre Flament
b046322c7b
Merge pull request #333 from dalf/enh-engine-descriptions
RFC: /preferences: display engine descriptions
2021-09-25 11:29:25 +02:00
Alexandre Flament
ab569c1e12 [fix] openstreetmap engine: optmizer SPARQL query
add
hint:Query hint:optimizer "None".
to the SPARQL query to keep the response time small.

It tells the optimizer to follow the path from ?item to the different property values
instead of the other way around.
See https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimization#Property_paths
2021-09-25 11:16:22 +02:00
Alexandre Flament
8961131497 [fix] fix the about section of some engines 2021-09-24 20:20:30 +02:00
Alexandre Flament
6f11b61cd5 [fix] openstreetmap engine: map "all" language to English 2021-09-24 20:12:18 +02:00
Markus Heiser
443bf35e09 [pylint] fix global-variable-not-assigned issues
If there is no write access, there is no need for global.  Remove global
statement if there is no assignment.

global-variable-not-assigned:
  Using global for names but no assignment is done Used when a variable is
  defined through the "global" statement but no assignment to this variable is
  done.

In Pylint 2.11 the global-variable-not-assigned checker now catches global
variables that are never reassigned in a local scope and catches (reassigned)
functions [1][2]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.11.html
[2] https://github.com/PyCQA/pylint/issues/1375

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-17 10:14:27 +02:00
Alexandre Flament
602cbc2c99
Merge pull request #297 from dalf/engine-logger-enh
debug mode: more readable logging
2021-09-14 07:06:28 +02:00
Alexandre Flament
f8793fbda0 [fix] logger per engine: make .logger is always initialized
the openstreetmap engine imports code from the wikidata engine.
before this commit, specific code make sure to copy the logger variable to the wikidata engine.

with this commit searx.engines.load_engine makes sure the .logger is initialized.
The implementation scans sys.modules for module name starting with searx.engines.
2021-09-13 08:47:59 +02:00
Alexandre Flament
0e42db9da1 [mod] xpath engine: remove logging of the requested URL 2021-09-11 10:13:16 +02:00
Markus Heiser
f0059b80ed [pylint] engines: drop no longer needed 'missing-function-docstring'
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 13:26:59 +02:00
Markus Heiser
82847df300 [fix] add 'categories' to PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES
androp no longer needed (see line 591 in 7b235a1)::

    # pylint: disable=undefined-variable

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 10:29:38 +02:00
Markus Heiser
cd033b5416 [fix] drop useless pylint: disable=undefined-variable
Since 7b235a1 (see line 591) it is no longer needed to disable
'undefined-variable' for names defined in::

   PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 10:26:15 +02:00
Alexandre Flament
ea60c03827 [fix] fix openstreetmap engine
close #298

This is a workaround: inside engine code, any call to function in another engine can crash
since the logger won't be initialized except if it is done explicitly.
2021-09-06 22:44:22 +02:00
Markus Heiser
aecfb2300d [mod] one logger per engine - drop obsolete logger.getChild
Remove the no longer needed `logger = logger.getChild(...)` from engines.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 18:05:46 +02:00
Markus Heiser
7b235a1c36 [mod] one logger per engine
Suggested-by: @dalf in https://github.com/searxng/searxng/issues/98#issuecomment-849013518
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-06 17:47:28 +02:00
Markus Heiser
9ff881f937 [fix] remove minimum length of content for XPath engine
Instead of raising an exception and therefore hiding all results of the engine.

It make sense to remove that requirement in order to allow the implementation of
search engines that do not always have a description.  In fact some search
engines that in 99% of the case have a description like Brave Search or Mojeek
crash completely if they for some reason included a result with no description.

To test this patch try Mojeek:

    !mjk xyz

before and after the patch.

Suggested-by: 0xhtml in https://github.com/searx/searx/discussions/2933
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-04 12:41:23 +02:00
Allen
a5a0a4e106 [fix] Correct engine name in for Rumble 2021-09-04 10:22:26 +02:00
Allen
49bbd250d9 [fix] Update about section of Invidious
Another website and new documentation
2021-09-04 10:22:07 +02:00
Markus Heiser
b83c14cf6b [pylint] Pylint 2.10 - fix use-list-literal & use-dict-literal
Pylint 2.10 added new default checks [1]:

use-list-literal
  Emitted when list() is called with no arguments instead of using []

use-dict-literal
  Emitted when dict() is called with no arguments instead of using {}

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31 10:40:29 +02:00
Noémi Ványi
3d5e6e0abb [enh] google: add filter=0 to Google engine for more results
backport from searx ( 23b3b56a06ef831af0a1b30a12c26ebd50e329bb )
2021-08-21 17:46:16 +02:00
Samuel Dudik
7a7ef9cea6 [fix] Seznam engine - some XPath selectors has been changed
Merged from https://github.com/dudik/searx/commit/5a4207759

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-27 07:13:41 +02:00
Alexandre Flament
48fe83b901
Merge pull request #221 from dalf/fix-peertube_fetch_supported_languages
[fix] peertube: update _fetch_supported_languages
2021-07-25 10:30:53 +02:00
Markus Heiser
fe67f1478f [fix] qwant engine - prevent API locale exception on lang 'all'
Has been reported in [1], error message::

    Error
        Error: searx.exceptions.SearxEngineAPIException
        Percentage: 0
        Parameters: ('API error::locale must be a string,locale must be one of
        the following values: en_gb, en_ie, en_us, en_ca, en_in, en_my, en_au,
        en_nz, cy_gb, gd_gb, de_de, de_ch, de_at, fr_fr, br_fr, fr_be, fr_ch,
        fr_ca, fr_ad, fc_ca, ec_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx,
        es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_br,
        pt_pt, pt_ad, nl_be, nl_nl, pl_pl, zh_hk, zh_cn, fi_fi, bg_bg, et_ee,
        hu_hu, da_dk, nb_no, sv_se, ko_kr, th_th, cs_cz, ro_ro, el_gr',)
        File name: searx/engines/qwant.py:114
        Function: response
        Code: raise SearxEngineAPIException('API error::' + msg)

[1] https://github.com/searxng/searxng/issues/222

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-24 14:48:27 +02:00
Markus Heiser
ca57c7421b [fix] qwant engine - prevent exception on date/time value is None
Has been reported in [1], error messages::

  Error
       Error: ValueError
       Percentage: 0
       Parameters: ()
       File name: searx/engines/qwant.py:159
       Function: response
       Code: pub_date = datetime.fromtimestamp(item['date'], None)

    Error
        Error: TypeError
        Percentage: 0
        Parameters: ('an integer is required (got type NoneType)',)
        File name: searx/engines/qwant.py:196
        Function: response
       Code: pub_date = datetime.fromtimestamp(item['date'])

Fix timedelta from seconds to milliseconds [1], error message::

    Error
        Error: TypeError
        Percentage: 0
        Parameters: ('unsupported type for timedelta seconds component: NoneType',)
        File name: searx/engines/qwant.py:195
        Function: response
        Code: length = timedelta(seconds=item['duration'])

[1] https://github.com/searxng/searxng/issues/222

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-24 14:48:14 +02:00
Alexandre Flament
b0a12924a0 [fix] peertube: update _fetch_supported_languages
update the regex to match the changes in peertube source code
fix "make data.languages"
2021-07-23 12:03:16 +02:00
Alexandre Flament
f523fd3ea7
Merge pull request #211 from MarcAbonce/onions_v3_fix_searxng
Update onion engines to v3
2021-07-16 17:25:37 +02:00
Alexandre Flament
d47b8e36cf
Merge pull request #207 from return42/mongodb
[enh] add mongodb offline engine
2021-07-16 16:15:01 +02:00
Alexandre Flament
0d65a81b1c [mod] qwant engine: fix typos / minor change
minor modification of commit 628b5703f3
(no functionnal change)
2021-07-16 15:32:12 +02:00
Marc Abonce Seguin
1b05ea6a6b update onion engines to v3
remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
2021-07-16 01:36:34 -07:00
Markus Heiser
0a9cd08bf1 [enh] add mongodb offline engine
Cherry-Pick: https://github.com/searx/searx/commit/198aad43
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-15 21:35:33 +02:00
Markus Heiser
628b5703f3 [mod] improve video results of the qwant engine
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-15 20:10:37 +02:00
Alexandre Flament
f376b4ed3e
Merge pull request #205 from unixfox/patch-2
Add missing parameter for mobile UI search
2021-07-15 17:19:12 +02:00
Émilien Devos
6c9f276571
Add missing parameter for mobile UI search 2021-07-15 13:00:32 +00:00
Markus Heiser
ef6e1bd6b9 [fix] Qwant engines - implement API v3 and add 'quant videos'
The implementation uses the Qwant API (https://api.qwant.com/v3). The API is
undocumented but can be reverse engineered by reading the network log of
https://www.qwant.com/ queries.

This implementation is used by different qwant engines in the settings.yml::

  - name: qwant
    categories: general
    ...
  - name: qwant news
    categories: news
    ...
  - name: qwant images
    categories: images
    ...
  - name: qwant videos
    categories: videos
    ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-14 09:47:32 +02:00
Markus Heiser
513c73a309 [drop] engine torrentz: torrentz2.eu and torrentz2.is are offline
[1] https://torrentfreak.com/torrentz2-eu-domain-suspended-by-registry-on-public-prosecutors-order-200628/

Suggested-by: @rasos https://github.com/searx/searx/issues/1875#issuecomment-877755872
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-07-11 13:24:33 +02:00
Émilien Devos
d9d9bd720d
Fix google images
Proposed fix in https://github.com/searx/searx/pull/2115#issuecomment-876716010
2021-07-10 14:09:29 +00:00
Markus Heiser
0ef6aa5126 [docs] add documentation from the sources of the google engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 18:25:52 +02:00
Markus Heiser
05e90f2e57 [fix] google answers: normalize space of the answers.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 16:50:25 +02:00
Markus Heiser
f096d68ec6 [mod] google engine: reduce mobile UI parameters to what is needed
Reverse engineering shows that not all of the parameters used by google's mobile
UI (aka "more results" button) are needed [1].

[1] https://github.com/searxng/searxng/pull/160#issuecomment-865013625

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-21 16:50:23 +02:00
Alexandre Flament
7a5c36408a [mod] google: add "use_mobile_ui" parameter to use mobile endpoint.
disable by default, it has to be enabled in settings.yml

related to  #159
2021-06-21 14:52:04 +02:00
Markus Heiser
9328c66e93 [fix] google news - send CONSENT Cookie to not be redirected
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!) which requires consent to tracking.  To get the consent
from the user, google-news requests are redirected to confirm and get a CONSENT
Cookie from https://consent.google.de/s?continue=...

This patch adds a CONSENT Cookie to the google-news request to avoid
redirection.

The behavior of the CONTENTS cookies over all google engines seems similar but
the pattern is not yet fully clear to me, here are some random samples from my
analysis ..

Using common google search from different domains::

    google.com:        CONSENT=YES+cb.{{date}}-14-p0.de+FX+816
    google.de:         CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    google.fr:         CONSENT=YES+srp.gws-{{date}}-0-RC2.fr+FX+826

When searching about videos (google-videos)::

    google.es:         CONSENT=YES+srp.gws-{{date}}-0-RC2.es+FX+076
    google.de:         CONSENT=YES+srp.gws-{{date}}-0-RC2.de+FX+171

Google news has only one domain for all languages::

    news.google.com:   CONSENT=YES+cb.{{date}}-14-p0.de+FX+816

Using google-scholar search from different domains::

    scholar.google.de: CONSENT=YES+cb.{{date}}-14-p0.de+FX+333
    scholar.google.fr: does not use such a cookie / did not ask the user
    scholar.google.es: does not use such a cookie / did not ask the user

Interim summary:

  Pattern is unclear and I won't apply the CONSENT cookie to all google engines.
  More experience is need before we generalize the CONSENT cookies over all
  google engines.

Related:

- e9a6ab401 [fix] youtube - send CONSENT Cookie to not be redirected
- https://github.com/benbusby/whoogle-search/issues/311
- https://github.com/benbusby/whoogle-search/issues/243

[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18 13:21:20 +02:00
Markus Heiser
dd7b53d369 [fix] google-news engine - KeyError: 'hl in request
Since we added

- 1c67b6aec [enh] google engine: supports "default language"

there is a KeyError: 'hl in request,error pattern::

    ERROR:searx.searx.search.processor.online:engine google news : exception : 'hl'
    Traceback (most recent call last):
      File "searx/search/processors/online.py", line 144, in search
        search_results = self._search_basic(query, params)
      File "searx/search/processors/online.py", line 118, in _search_basic
        self.engine.request(query, params)
      File "searx/engines/google_news.py", line 97, in request
        if lang_info['hl'] == 'en':
      KeyError: 'hl'

Closes: https://github.com/searxng/searxng/issues/154
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-18 11:34:11 +02:00
Markus Heiser
343570f7fb [pylint] searx/engines/duckduckgo_definitions.py
BTW: normalize indentations

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-14 09:22:29 +02:00
Markus Heiser
2ac3e5b20b [fix] log messages from: google- images, news, scholar, videos
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language']
- remove obsolete query_url log messages which is already logged by
  httpx._client:HTTP request

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11 16:31:50 +02:00
Markus Heiser
1ac3961336 [mod] google - get_lang_info add documentataion & comments
BTW: remove obsolete log messages from google engine

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-11 16:06:36 +02:00
Alexandre Flament
1c67b6aece [enh] google engine: supports "default language"
Same behaviour behaviour than Whoogle [1].  Only the google engine with the
"Default language" choice "(all)"" is changed by this patch.

When searching for a locate place, the result are in the expect language,
without missing results [2]:

  > When a language is not specified, the language interpretation is left up to
  > Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle.  With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add a ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other
google engines.  With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
2021-06-10 10:22:01 +02:00
Markus Heiser
bf10b4a857 [fix] openstreetmap - fix some minor whitespace & indentation issues
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-06-09 18:08:23 +02:00