Commit graph

1380 commits

Author SHA1 Message Date
Markus Heiser
a5580c785d WIP
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-29 19:39:27 +02:00
Markus Heiser
744d96a16c [fix] startpage engine: language/region & time support & fix CAPTCHA
One reason for the frequently seen CAPTCHA on startpage requests is the
incomplete requests SearXNG sends to startpage.com.  To avoid the CAPTCHA we
need to send a well-formed HTTP POST request with a cookie, i.e. a request
that is identical to the request built by startpage.com itself:

- in the cookie the **region** is selected
- in the POST arguments the **language** is selected

Based on the *engine_properties* boilerplate, SearXNG's startpage engine now
implements a `_fetch_engine_properties()` function to fetch regions & languages
from startpage.com.
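
A minimal sketch of the request layout described above (region in the cookie,
language in the POST arguments).  The endpoint, the cookie layout and the form
field names are assumptions made for illustration, they are not taken from the
patch:

```python
# Sketch only: URL, cookie layout and field names are assumptions,
# not startpage.com's documented interface.
SEARCH_URL = "https://www.startpage.com/sp/search"  # assumed endpoint


def build_request(query, region, language):
    """Describe a POST request: region in the cookie, language in the form data."""
    return {
        "method": "POST",
        "url": SEARCH_URL,
        # hypothetical cookie layout carrying the selected region
        "cookies": {"preferences": f"search_results_region={region}"},
        # hypothetical form fields carrying the query and the selected language
        "data": {"query": query, "language": language},
    }
```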

This patch is a completely new implementation of the request() function, reverse
engineered from the startpage.com page.  The new implementation adds

- time-range support
- safe-search support

to the startpage engine, both of which were missing in the past.

The locale code 'no_NO' from startpage does not exist and is mapped to nb-NO.
For reference see the language subtag registry at IANA [1], `no` is the macrolanguage::

     type: language
     Subtag: nb
     Description: Norwegian Bokmål
     Added: 2005-10-16
     Suppress-Script: Latn
     Macrolanguage: no
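
The mapping can be sketched as a small lookup table.  The names
`LOCALE_FIXES` and `normalize_locale` are hypothetical and only illustrate the
idea, the patch itself may solve this differently:

```python
# Engine locale codes that do not exist as such, mapped to valid tags.
LOCALE_FIXES = {"no_NO": "nb-NO"}


def normalize_locale(engine_code):
    """Return a locale tag with '-' separator, applying known fixes first."""
    if engine_code in LOCALE_FIXES:
        return LOCALE_FIXES[engine_code]
    return engine_code.replace("_", "-")
```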

Additional hints:

- To fetch languages from startpage, this patch makes use of the
  EngineProperties implemented in 7bf0d46c

- To get Startpage's locale & language, the function get_engine_locale from
  9ae409a is used.

[1] https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
[2] https://www.w3.org/International/questions/qa-choosing-language-tags#langsubtag

Closes: https://github.com/searxng/searxng/issues/1081
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-29 19:27:50 +02:00
Markus Heiser
da659123c1 [mod] qwant: moved supported_languages to type EngineProperties
"type": "engine_properties"

Supported languages in qwant are locales with a territory tag (aka regions).
Moved `supported_languages` to `EngineProperties.regions`.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-29 18:13:46 +02:00
Markus Heiser
3b10d63e2f [mod] engines_languages.json: add new type EngineProperties
This patch adds the boilerplate code, needed to fetch properties from engines.
In the past we only fetched *languages* but some engines need *regions* to
parameterize the engine request.

To fit into our *fetch language* procedures the boilerplate is implemented in
`searxng_extra/update/update_languages.py` and the *engine_properties* are
stored alongside the languages in `searx/data/engines_languages.json`.

This implementation is backward compatible with the `_fetch_supported_languages()`
infrastructure we have.  Once all `_fetch_supported_languages()` implementations
have been moved to `_fetch_engine_properties()` implementations, we can rename
the files and scripts.

The new type `EngineProperties` is a dictionary with keys `languages` and
`regions`.  The values are dictionaries that map SearXNG's language & region
to the option values the engine uses::

    engine_properties = {
        'type' : 'engine_properties',  # <-- !!!
        'regions': {
            # 'ca-ES' : <engine's region name>
        },
        'languages': {
            # 'ca' : <engine's language name>
        },
    }

Similar to the `supported_languages`, in the engine the properties are available
under the name `supported_properties`.
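
A minimal sketch of how an engine might look up its option values in such a
dictionary.  The helper `engine_option` and the engine-side values (`es_ca`,
`catalan`) are hypothetical:

```python
supported_properties = {
    "type": "engine_properties",
    "regions": {"ca-ES": "es_ca"},     # hypothetical engine region name
    "languages": {"ca": "catalan"},    # hypothetical engine language name
}


def engine_option(properties, kind, sxng_tag, default=None):
    """Map a SearXNG tag to the engine's value; kind is 'regions' or 'languages'."""
    return properties.get(kind, {}).get(sxng_tag, default)
```

For example, `engine_option(supported_properties, "regions", "ca-ES")` yields
the engine's region name, and an unknown tag falls back to `default`.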

Initially we start with languages & regions, but in a wider sense the type is
named *engine properties*.  Engines can store whatever options they need, and
in the future there may be a need to fetch additional or completely different
properties.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-29 18:13:46 +02:00
Alexandre Flament
56000d5162
Merge pull request #1699 from liimee/eng-app-store
add apple app store engine
2022-08-27 07:43:23 +02:00
Alexandre Flament
44bc94c36e
Merge pull request #1700 from liimee/eng-ddm
add apple maps engine
2022-08-27 07:36:16 +02:00
ta
5057007270 remove thumbnail from results 2022-08-27 06:23:30 +07:00
ta
525946d7dd add poi's website and phone number, doesn't crash when there is no displayMapRegion, query the token on the first request 2022-08-27 06:17:58 +07:00
ta
5dce299b22 add apple maps engine 2022-08-25 17:05:40 +07:00
ta
cef7bbab22 get the uncropped version of the thumbnail when the image height is not too important 2022-08-24 18:33:11 +07:00
ta
78bff4618c add safesearch support 2022-08-24 18:31:04 +07:00
ta
bcae7ae4e3 add developer info as author 2022-08-24 17:50:38 +07:00
ta
e5c1b64b1d add the apple app store engine
The Apple App Store is the digital app distribution platform for iOS & iPadOS.
2022-08-24 17:27:36 +07:00
ta
040e24f9ad support playing videos directly 2022-08-24 16:48:31 +07:00
ta
79d06509c1 add tags as suggestions 2022-08-23 05:18:35 +07:00
ta
d22f469010 use invalid-name instead of C0103 for pylint 2022-08-22 18:27:35 +07:00
ta
dd9127492f add 9gag engine
9GAG is a social media website where users upload and share user-generated images and videos
2022-08-22 17:35:07 +07:00
M Asenov
faa32d5773 fixed xpath selector for appropriate results 2022-08-21 20:08:00 +01:00
Alexandre Flament
5ed40af3ba
Merge pull request #1661 from liimee/eng-tw
Add twitter engine
2022-08-21 15:21:18 +02:00
Markus Heiser
77a0f33819 [fix] engine duden - don't raise exception on empty result list
Duden expects a word in German, so with query "amazing" the site finds nothing
and responds with a 404:

    httpx.HTTPStatusError: Client error '404 Not Found' for url\
      'https://www.duden.de/suchen/dudenonline/amazing'
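
The idea of the fix, sketched without any HTTP library: a 404 is treated as
"no results" instead of an error.  The function `parse_results` and its
arguments are hypothetical, the engine's real response handling differs:

```python
def parse_results(status_code, titles):
    """Return result dicts; a 404 means the word was not found, so no results."""
    if status_code == 404:
        return []
    return [{"title": title} for title in titles]
```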

[1] https://github.com/searxng/searxng/issues/1543#issuecomment-1193317054

Suggested-by: @allendema [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-20 08:41:03 +02:00
ta
05851978cf add explanation of token 2022-08-17 19:45:42 +07:00
ta
c8acd4a3b6 add profile image to user results 2022-08-17 14:30:59 +07:00
ta
b6fd7cd571 add thumbnail to results if available 2022-08-17 14:25:22 +07:00
Markus Heiser
27385e7898 [mod] qwant - add safesearch option
Closes: https://github.com/searxng/searxng/issues/1640
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser
6579d6d558 [fix] qwant - API error::locale must be one ..
The request function should not request a language (aka locale) that is not
supported by qwant. Selecting a locale like zh-TW ends in qwant's API error:

  ERROR searx.engines.qwant news: exception : \
  API error::locale must be one of the following values: \
    en_gb, en_ie, en_us, en_ca, en_my, en_au, en_nz, de_de, de_ch, de_at, fr_fr, \
    fr_be, fr_ch, fr_ca, fr_ad, fc_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx, \
    es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_pt, pt_ad, \
    nl_be, nl_nl

The existing searx.utils.match_language function is unsuitable for this purpose;
it is replaced by the function searx.locales.get_engine_locale, which is based
on the methods from the babel package.

qwant's _fetch_supported_languages function has been revised to filter out
languages (aka locales) not supported by qwant.
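
A simplified sketch of the selection problem.  It is not the babel based
`get_engine_locale`, only a plain string match with a fallback.  The helper
`pick_locale` is hypothetical and the locale set is a subset of the list in
the error message above:

```python
QWANT_LOCALES = {"en_gb", "en_us", "de_de", "de_ch", "fr_fr", "fr_be"}


def pick_locale(sxng_locale, supported, default="en_us"):
    """Pick a supported locale: exact match, then same language, then default."""
    candidate = sxng_locale.replace("-", "_").lower()
    if candidate in supported:
        return candidate
    language = candidate.split("_")[0]
    # sorted() keeps the choice deterministic when several territories match
    for locale in sorted(supported):
        if locale.startswith(language + "_"):
            return locale
    return default
```

With this sketch a request for zh-TW falls back to the default instead of
sending a locale that qwant rejects.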

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser
75bb8c45d0 [mod] decouple qwant's categories from SearXNG's categories
By using the new property `qwant_categ:`, the category of qwant is no longer
bound to the category of SearXNG.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:26:54 +02:00
ta
96ea355a1f add twitter engine 2022-08-14 08:39:41 +07:00
Markus Heiser
eb02cc77c5 [fix] google - simplify XPath selectors to fetch more results
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-10 18:55:31 +02:00
Émilien Devos
b9f16a77db output format protobuf to HTML for google mobile 2022-08-10 09:36:06 +00:00
Brock Vojković
24210fb10b
Revert PR #1633
This reverts the changes made to the Google results XPath in PR #1633.
2022-08-10 03:41:39 +02:00
Léon Tiekötter
94b3656b4a [fix] google engine: results XPath
It seems Google rolls out changes first on the `google.com` domain and later on
the "language" domains.  For example: yesterday [1] `google.com` did not work but
`google.de` and `google.fr` did work; today they no longer work and this
fix is needed on all domains.

Closes: https://github.com/searxng/searxng/issues/1628
[1] https://github.com/searxng/searxng/issues/1628#issuecomment-1208191816
2022-08-09 06:23:59 +02:00
Markus Heiser
8df1f0c47e [mod] add 'Accept-Language' HTTP header to online processors
Most engines that support languages (and regions) use the Accept-Language from
the web browser to build a response that fits the language (and region).

- add new engine option: send_accept_language_header
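
A minimal sketch of how such a header value could be built from a locale tag.
The helper `accept_language` and the quality values are illustrative
assumptions, not necessarily what the processors send:

```python
def accept_language(locale_tag):
    """Build an Accept-Language value, e.g. for 'de-CH' or a bare 'en'."""
    language = locale_tag.split("-")[0]
    if language == locale_tag:
        # language only, no region part
        return f"{language},*;q=0.5"
    # prefer the full locale, then the bare language, then anything else
    return f"{locale_tag},{language};q=0.7,*;q=0.3"
```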

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-01 17:01:59 +02:00
Alexandre Flament
2babf59adc [fix] pyright reported errors
The errors make pyright usage useless since a new error won't be seen [1].

[1] https://github.com/searxng/searxng/pull/1569

```
  searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
    "Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
    Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
  searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
    Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
  searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
    Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
  searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
    Type "str" cannot be assigned to type "BytesPath"
      "str" is incompatible with "bytes"
      "str" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['themes']" cannot be assigned to type "BytesPath"
      "Literal['themes']" is incompatible with "bytes"
      "Literal['themes']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "str | Any | None" cannot be assigned to type "BytesPath"
      Type "str" cannot be assigned to type "BytesPath"
        "str" is incompatible with "bytes"
        "str" is incompatible with protocol "PathLike[bytes]"
          "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['img']" cannot be assigned to type "BytesPath"
      "Literal['img']" is incompatible with "bytes"
      "Literal['img']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
  searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
  searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
  searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
  searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
  searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
    Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
  searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
  searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
2022-07-30 18:04:44 +02:00
Markus Heiser
c72d70d45c Revert "Quick fix for google engine for EU countries"
This reverts commit 747cf1a246.
2022-07-26 06:39:44 +02:00
Léon Tiekötter
950f036c03
[fix] google engine: results XPath 2022-07-26 00:24:15 +02:00
Émilien Devos
747cf1a246
Quick fix for google engine for EU countries
This reverts part of commit 5fb2071cb2
2022-07-25 20:48:50 +00:00
Markus Heiser
0be0e63117 [fix] demo_online.py - fixed typo
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25 20:04:00 +02:00
Emilien Devos
5fb2071cb2 [fix] google & youtube - set EU consent cookie
This changes the previous bypass method for Google consent using
``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``.

The youtube_noapi and google have a similar API, at least for the consent [1].

Get CONSENT cookie from google request::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

  Google change the way for consent about YouTube cookies agreement in EU
  countries. Instead of showing a popup in the website, YouTube redirects the
  user to a new webpage at consent.youtube.com domain ...  Fix for this is to
  put a cookie CONSENT with YES+ value for every YouTube request
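
A minimal sketch of setting the cookie on outgoing request parameters.  The
helper name `add_consent_cookie` and the shape of `params` (a dict with a
`cookies` dict) are assumptions for illustration:

```python
def add_consent_cookie(params):
    """Add CONSENT=YES+ to the request cookies, keeping existing cookies."""
    params.setdefault("cookies", {})["CONSENT"] = "YES+"
    return params
```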

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
2022-07-25 13:27:06 +02:00
Markus Heiser
4231a5770b [fix] sjp engine - convert engine name to a latin1 compliant name
The engine name is not only a *name*, it's also an identifier that is used in
logs, HTTP headers and more.  Unicode characters in the name of an engine could
cause various issues.
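
One possible way to derive such a name, sketched with the standard library
only.  The helper `to_latin1_name` is hypothetical, the patch may simply
rename the engine by hand.  Note that characters without a Unicode
decomposition are dropped by this approach:

```python
import unicodedata


def to_latin1_name(name):
    """Return name unchanged if it is latin-1 encodable, else an ASCII fallback."""
    try:
        name.encode("latin-1")
        return name
    except UnicodeEncodeError:
        # split accented characters into base letter + combining mark,
        # then drop everything that is not ASCII
        decomposed = unicodedata.normalize("NFKD", name)
        return decomposed.encode("ascii", "ignore").decode("ascii")
```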

Closes: https://github.com/searxng/searxng/issues/1544
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-24 21:10:55 +02:00
james-still
2516e21c58 [fix] emojipedia - update XPath to be relative 2022-07-24 19:14:26 +02:00
Markus Heiser
1540891561 [fix] engine tineye: handle 422 response of not supported img format
Closes: https://github.com/searxng/searxng/issues/1449
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-23 16:00:58 +02:00
Markus Heiser
4e05197444
Merge pull request #1475 from return42/Emojipedia
[mod] Add engine for Emojipedia
2022-07-15 09:30:40 +02:00
Jay
10edcbe3c2 [mod] Add engine for Emojipedia
Emojipedia is an emoji reference website which documents the meaning and
common usage of emoji characters in the Unicode Standard.  It has been owned by
Zedge since 2021. Emojipedia is a voting member of The Unicode Consortium.[1]

Cherry picked from @james-still [2][3] and slightly modified to fit SearXNG's
quality gates.

[1] https://en.wikipedia.org/wiki/Emojipedia
[2] 2fc01eb20f
[3] https://github.com/searx/searx/pull/3278
2022-07-15 09:26:44 +02:00
Alexandre Flament
44f2eb50a5
Merge pull request #1219 from dalf/follow_bing_redirect
bing.py: remove redirection links
2022-07-10 18:06:22 +02:00
Emilien Devos
6face215b8 bypass google consent with ucbcb=1 2022-07-09 21:33:24 +00:00
Alexandre Flament
a1e8af0796 bing.py: resolve bing.com/ck/a redirections
add a new function searx.network.multi_requests to send multiple HTTP requests at once
2022-07-08 22:02:21 +02:00
Markus Heiser
970a69012b [fix] engine z-library https URL
before this patch:

    DEBUG   searx.engines.z-library : using base_url: https:https://de1lib.org

with this patch URL is fixed to:

    DEBUG   searx.engines.z-library : using base_url: https://de1lib.org
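
A sketch of a guard against a doubled scheme like the one in the log above.
The helper `ensure_https` is hypothetical and not necessarily how the patch
fixes the URL:

```python
def ensure_https(url):
    """Prepend a scheme only when the URL does not already carry one."""
    if url.startswith("//"):
        # protocol-relative URL
        return "https:" + url
    if url.startswith(("http://", "https://")):
        return url
    return "https://" + url
```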

Closes: https://github.com/searxng/searxng/issues/1435
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-05 22:27:55 +02:00
ta
14756a2674 [mod] Adds Lingva translate engine
Add the lingva engine (which grabs data from google translate).  Results from
Lingva are added to the infobox results.
2022-07-04 19:06:45 +02:00
Markus Heiser
5831c15b49 [fix] engines/openstreetmap.py typo: user_langage --> user_language
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-02 16:51:25 +02:00
Alexandre Flament
6716c6b0c3 openstreetmap engine: return the localized name.
For example: display "Tokyo" instead of "東京都" when the language is English.
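
A minimal sketch of the lookup, assuming the result carries OpenStreetMap tags
where localized names are stored under `name:<language>`.  The helper
`localized_name` is hypothetical:

```python
def localized_name(tags, user_language):
    """Prefer the name in the user's language, fall back to the default name."""
    return tags.get(f"name:{user_language}") or tags.get("name")
```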
2022-07-02 16:51:25 +02:00