searxngRebrandZaclys/searx/engines
Markus Heiser 2499899554 [mod] Google: reversed engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the Google engines, including improved language
and region handling based on the engine.traits_v1 data.

Whenever possible, the implementations of the Google engines try to make use of
the async REST APIs.  The get_lang_info() function has been generalized to a
get_google_info() function; in particular, the region handling has been improved
by adding the cr parameter.

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - Google (WEB),
  - Google images,
  - Google news,
  - Google scholar and
  - Google videos

  and remove data from obsolete data type "supported_languages".

  A traits.custom type that maps region codes to *supported_domains* is fetched
  from https://www.google.com/supported_domains
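
  As a rough illustration of how such a mapping could be derived, the sketch
  below parses text in the one-domain-per-line form (e.g. ".google.de") and keys
  each domain by its upper-cased top-level label.  The helper name, the "www"
  prefix and the sample data are assumptions for illustration, not the code of
  this commit.

```python
def parse_supported_domains(text):
    """Map an upper-cased region code to a host name (hypothetical helper)."""
    domains = {}
    for line in text.split():
        domain = line.strip()
        # Skip blanks and the generic domain, which carries no region label.
        if not domain or domain == ".google.com":
            continue
        # The last dot-separated label is taken as the region code (assumption).
        region = domain.rsplit(".", 1)[-1].upper()
        domains[region] = "www" + domain
    return domains


# Sample input in the assumed format; no network access is needed.
sample = ".google.com\n.google.de\n.google.co.uk\n.google.com.br\n"
mapping = parse_supported_domains(sample)
```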

searx/autocomplete.py:
  Reverse-engineered autocomplete from Google WEB.  Supports Google's languages and
  subdomains.  The old API suggestqueries.google.com/complete has been replaced
  by the async REST API: https://{subdomain}/complete/search?{args}
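
  A minimal sketch of filling in the URL template above.  The query argument
  names "q" and "hl" and the example subdomain are assumptions chosen for
  illustration; the arguments actually sent by searx/autocomplete.py may differ.

```python
from urllib.parse import urlencode


def autocomplete_url(subdomain, query, lang="en"):
    """Build an autocomplete URL from the template (hypothetical helper)."""
    # "q" and "hl" are assumed argument names, not taken from the commit.
    args = urlencode({"q": query, "hl": lang})
    return "https://{subdomain}/complete/search?{args}".format(
        subdomain=subdomain, args=args
    )


url = autocomplete_url("www.google.de", "searx ng", lang="de")
```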

searx/engines/google.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - always use the async REST API (formerly known as 'use_mobile_ui')
  - use *supported_domains* from traits
  - improved the result list by fetching './/div[@data-content-feature]'
    and parsing the type of the various *content features* --> thumbnails are
    added

searx/engines/google_images.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - if exists, freshness_date is added to the result
  - issue 1864: result list has been improved a lot (due to the new cr parameter)

searx/engines/google_news.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
    *supported_domains* is not needed but a ceid list has been added.
  - different region handling compared to Google WEB
  - fixed for various languages & regions (due to the new ceid parameter) /
    avoid CONSENT page
  - Google News no longer supports time range
  - result list has been fixed: XPath of pub_date and pub_origin

searx/engines/google_videos.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - add paging support
  - implement an async request ('asearch': 'arc' & 'async':
    'use_ac:true,_fmt:html')
  - simplified code (thanks to '_fmt:html' request)
  - issue 1359: fixed xpath of video length data
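
  The async request arguments named above could be encoded into a query string
  roughly as follows.  Only 'asearch' and 'async' come from this commit message;
  the helper name and the "q" and "start" arguments (used here for the paging
  offset) are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode


def video_query_args(query, page=1, results_per_page=10):
    """Encode the query arguments of an async request (hypothetical helper)."""
    return urlencode(
        {
            "q": query,  # assumed argument name
            "start": (page - 1) * results_per_page,  # assumed paging offset
            "asearch": "arc",
            "async": "use_ac:true,_fmt:html",
        }
    )


# Round-trip the encoded string to inspect the values that would be sent.
encoded = video_query_args("big buck bunny", page=2)
decoded = parse_qs(encoded)
```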

searx/engines/google_scholar.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - request(): include patents & citations
  - response(): fixed CAPTCHA detection (Scholar has its own CAPTCHA manager)
  - hardened XPath to iterate over results
  - fixed XPath of pub_type (has been changed from gs_ct1 to gs_cgt2 class)
  - issue 1769 fixed: new request implementation is no longer incompatible

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
__init__.py [mod] replace engines_languages.json by engines_traits.json 2023-03-24 10:37:42 +01:00
9gag.py get the not cropped version of the thumbnail when the image height is not too important 2022-08-24 18:33:11 +07:00
1337x.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
ahmia.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
apkmirror.py rollback test 2023-03-15 19:55:20 +01:00
apple_app_store.py remove thumbnail from results 2022-08-27 06:23:30 +07:00
apple_maps.py add poi's website and phone number, doesn't crash when there is no displayMapRegion, query the token on the first request 2022-08-27 06:17:58 +07:00
archlinux.py [enh] add more categories 2022-01-05 11:00:11 +01:00
artic.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
arxiv.py Science category: update the engines 2022-09-23 20:45:58 +02:00
bandcamp.py [mod] bandcamp & genius: in result set img_src instead thumbnail 2022-02-21 22:12:07 +01:00
base.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
bing_images.py [mod] bing_images: use async API & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
bing_news.py [mod] bing_news: use async API & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
bing_videos.py [mod] bing_videos: use async API & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
bing.py [doc] add a description of bing engines (web, news, video, images) 2023-03-24 10:37:42 +01:00
btdigg.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
command.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
core.py [fix] doc of the paper.html template (isbn, issn) 2022-09-25 15:46:29 +02:00
crossref.py [mod] make python code pylint 2.16.1 compliant 2023-02-10 13:59:21 +01:00
currency_convert.py [pylint] engines/currency_convert.py 2022-02-01 08:02:42 +01:00
dailymotion.py [mod] Dailymotion: fetch engine traits (data_type: supported_languages) 2023-03-24 10:37:42 +01:00
deepl.py [mod] add deepl translation engine 2022-08-10 09:14:36 +02:00
deezer.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
demo_offline.py [mod] replace engines_languages.json by engines_traits.json 2023-03-24 10:37:42 +01:00
demo_online.py [mod] add 'Accept-Language' HTTP header to online processores 2022-08-01 17:01:59 +02:00
deviantart.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
dictzone.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
digbt.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
docker_hub.py [mod] make python code pylint 2.16.1 compliant 2023-02-10 13:59:21 +01:00
doku.py [fix] issues reported by pylint 2021-12-27 10:16:20 +01:00
duckduckgo_definitions.py [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
duckduckgo_images.py [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
duckduckgo_weather.py [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
duckduckgo.py [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
duden.py [fix] engine duden - don't raise exception on empty result list 2022-08-20 08:41:03 +02:00
dummy-offline.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
dummy.py [enh] engines: add about variable 2021-01-14 20:57:17 +01:00
ebay.py [mod] Pass desired ebay domain in settings 2022-04-16 19:10:35 +02:00
elasticsearch.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
emojipedia.py [fix] emojipedia - update XPath to be relative 2022-07-24 19:14:26 +02:00
fdroid.py [enh] add more categories 2022-01-05 11:00:11 +01:00
flickr_noapi.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
flickr.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
framalibre.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
freesound.py [mod] result_templates/default.html replace embedded HTML by data_src audio_src 2022-02-13 14:20:47 +01:00
frinkiac.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
genius.py [mod] bandcamp & genius: in result set img_src instead thumbnail 2022-02-21 22:12:07 +01:00
gentoo.py [enh] add more categories 2022-01-05 11:00:11 +01:00
gigablast.py [mod] make python code pylint 2.16.1 compliant 2023-02-10 13:59:21 +01:00
github.py [fix] typos / reported by @kianmeng in searx PR-3366 2022-09-27 18:32:14 +02:00
google_images.py [mod] Google: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
google_news.py [mod] Google: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
google_play_apps.py [mod] add 'Accept-Language' HTTP header to online processores 2022-08-01 17:01:59 +02:00
google_scholar.py [mod] Google: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
google_videos.py [mod] Google: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
google.py [mod] Google: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
imdb.py [enh] move dictionaries, Erowid & IMDb out of general category 2022-01-05 11:03:44 +01:00
ina.py [fix] ina engine 2022-01-28 22:33:41 +01:00
invidious.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
jisho.py [format.python] based on bugfix in 9ed626130 2022-05-07 18:23:10 +02:00
json_engine.py [enh] Initial no paging support for Yep.com 2022-06-11 14:17:44 +02:00
kickass.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
lingva.py [mod] Adds Lingva translate engine 2022-07-04 19:06:45 +02:00
loc.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
mediathekviewweb.py [fix] engine mediathekviewweb: replace http links by https 2022-03-07 19:49:16 +01:00
mediawiki.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
meilisearch.py [pylint] engines: drop no longer needed 'missing-function-docstring' 2021-09-07 13:26:59 +02:00
metacpan.py Add MetaCPAN engine 2022-11-07 08:07:06 -06:00
mixcloud.py [mod] add artwork to mixcloud & soundcloud engines 2022-02-19 21:59:12 +01:00
mongodb.py [fix] pyright repported errors 2022-07-30 18:04:44 +02:00
mysql_server.py [fix] pyright repported errors 2022-07-30 18:04:44 +02:00
nyaa.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
opensemantic.py [enh] engines: add about variable 2021-01-14 20:57:17 +01:00
openstreetmap.py [mod] add 'Accept-Language' HTTP header to online processores 2022-08-01 17:01:59 +02:00
openverse.py [fix] ccengine engine - avoid unwanted redirects 2022-01-07 14:14:31 +01:00
pdbe.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
peertube.py [mod] Peertube: re-engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
petal_images.py [enh] Initial Petalsearch Images support 2022-06-02 14:32:37 +02:00
photon.py [fix] typos / reported by @kianmeng in searx PR-3366 2022-09-27 18:32:14 +02:00
piratebay.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
postgresql.py [fix] pyright repported errors 2022-07-30 18:04:44 +02:00
pubmed.py Science category: update the engines 2022-09-23 20:45:58 +02:00
qwant.py [mod] qwant: fetch engine traits (data_type: traits_v1) 2023-03-24 10:37:42 +01:00
recoll.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
reddit.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
redis_server.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
rumble.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
scanr_structures.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
searchcode_code.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
searx_engine.py reference docs.searxng.org 2022-01-02 21:18:29 +01:00
semantic_scholar.py [mod] science category: various update of about PR 1705 2022-09-23 20:52:55 +02:00
sepiasearch.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
seznam.py [enh] add more categories 2022-01-05 11:00:11 +01:00
sjp.py [fix] sjp engine - convert enginename to a latin1 compliance name 2022-07-24 21:10:55 +02:00
solidtorrents.py [fix] solidtorrents engine: store random bas_url in param 2022-02-04 14:55:21 +01:00
solr.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
soundcloud.py [mod] add artwork to mixcloud & soundcloud engines 2022-02-19 21:59:12 +01:00
spotify.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
springer.py [fix] springer: unsupported operand type(s) for +: 'NoneType' and 'str' 2022-09-25 15:25:55 +02:00
sqlite.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
stackexchange.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
startpage.py [mod] Startpage: reversed engineered & upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
tineye.py [fix] engine tineye: handle 422 response of not supported img format 2022-07-23 16:00:58 +02:00
tokyotoshokan.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
torznab.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
translated.py [enh] move dictionaries, Erowid & IMDb out of general category 2022-01-05 11:03:44 +01:00
twitter.py add explanation of token 2022-08-17 19:45:42 +07:00
unsplash.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
vimeo.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
wikidata.py [mod] wikipedia & wikidata: upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
wikipedia.py [mod] wikipedia & wikidata: upgrade to data_type: traits_v1 2023-03-24 10:37:42 +01:00
wolframalpha_api.py [fix] typos / reported by @kianmeng in searx PR-3366 2022-09-27 18:32:14 +02:00
wolframalpha_noapi.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
wordnik.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
wttr.py simplify infobox result 2022-08-31 18:29:50 +07:00
www1x.py [fix] 1x engine 2022-01-30 19:48:40 +01:00
xpath.py [fix] typos / reported by @kianmeng in searx PR-3366 2022-09-27 18:32:14 +02:00
yacy.py [format.python] initial formatting of the python code 2021-12-27 09:26:22 +01:00
yahoo_news.py [fix] issues reported by pylint 2021-12-27 10:16:20 +01:00
yahoo.py [mod] yahoo: fetch engine traits (data_type: traits_v1) 2023-03-24 10:37:42 +01:00
youtube_api.py [mod] templates: rename field for <iframe> URL to iframe_src 2022-02-18 19:00:49 +01:00
youtube_noapi.py [fix] google & youtube - set EU consent cookie 2022-07-25 13:27:06 +02:00
zlibrary.py [fix] engine z-zlibrary https URL 2022-07-05 22:27:55 +02:00