Commit Graph

1538 Commits

Author SHA1 Message Date
Markus Heiser 15eaf0f15f [mod] bing_news: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser ff80e7637e [mod] bing_images: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser bc21d28298 [mod] bing_videos: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser d0f465e6fa [mod] bing: add time_range support & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 7daf4f95ef [mod] Wikipedia: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Wikipedia engines.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser f78f908383 [mod] Google: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Google engines.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser dba8977b09 [mod] DuckDuckGo: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the DuckDuckGo engines.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser ef143729a0 [mod] yahoo: fetch engine traits (data_type: traits_v1)
Implements a fetch_traits function for the Yahoo engine.

.. note::

   Includes migration of the request methode from 'supported_languages' to
   'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c1ae2ef57c [mod] qwant: fetch engine traits (data_type: traits_v1)
Implements a fetch_traits function for the Qwant engines.

.. note::

   Includes migration of the request methode from 'supported_languages' to
   'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser fc0c775030 [mod] Dailymotion: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Dailymotion engine.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 61383edb27 [mod] Startpage: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Startpage engine.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser d3aa690a7a [mod] bing: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Bing engines.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser a7fe22770a [mod] Peertube: re-engineered & upgrade to data_type: traits_v1
- fetch_traits(): Fetch languages from peertube's search-index source code.

  [mod] Include migration of the request methode from 'supported_languages'
        to 'traits' (EngineTraits) object.
  [fix] old supported_languages_url is no longer valid since the sources
        has been moved to a different path.

- fixed code to pass pylint
- request(): complete re-implementation based on the API docs [1]
- response(): complete re-implementation, adds serveral fields missed before
- add source code documentation

[1] https://docs.joinpeertube.org/api-rest-reference.html#tag/Search/operation/searchVideos

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 6e5f22e558 [mod] replace engines_languages.json by engines_traits.json
Implementations of the *traits* of the engines.

Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*.  Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.

To load traits from the persistence::

    searx.enginelib.traits.EngineTraitsMap.from_data()

For new traits new properties can be added to the class::

    searx.enginelib.traits.EngineTraits

.. hint::

   Implementation is downward compatible to the deprecated *supported_languages
   method* from the vintage implementation.

   The vintage code is tagged as *deprecated* an can be removed when all engines
   has been ported to the *traits method*.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Solirs ac169a0f75 Pass black formatting test 2023-03-21 00:41:36 +01:00
Solirs e26bce33d4 WIKIDATA: Add description for results 2023-03-21 00:14:54 +01:00
Alexandre Flament 3e9cddc606
rollback test 2023-03-15 19:55:20 +01:00
Alexandre Flament 41ed0ef0c7
test 2023-03-15 19:53:53 +01:00
Markus Heiser 4c06837a50 [mod] make python code pylint 2.16.1 compliant
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-10 13:59:21 +01:00
Markus Heiser 257dc7d6c4 [fix-2146] set different HTTP Referer header to DuckDuckGo requests
For what ever reasons, ddg-lite don't like the Referer

  https://lite.duckduckgo.com/

In an interactive session in the WEB browser the the Reverer has exactly this
value, but ddg-lite don't like this value when the request is build up by
SearXNG.  The new value is:

  https://google.com/

What fakes a user comes from a google link.

Related: https://github.com/searxng/searxng/pull/2081
Closes: https://github.com/searxng/searxng/issues/2146

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-03 08:45:51 +01:00
Alexandre Flament 9d102fb08f
Merge pull request #2132 from dalf/update_pr_1967
search.suspended_time settings: bug fixes
2023-01-29 20:48:43 +01:00
Alexandre Flament bfca63c536 wikipedia engine: update _fetch_supported_languages
the layout https://meta.wikimedia.org/wiki/List_of_Wikipedias has changed
2023-01-29 10:01:58 +00:00
Alexandre Flament 8256de2fe8 peertube engine: update _fetch_supported_languages
There is now an API to get the list of supported languages
https://docs.joinpeertube.org/api-rest-reference.html#tag/Video/operation/getLanguages
2023-01-29 10:01:54 +00:00
Alexandre Flament 37addec69e search.suspended_time settings: bug fixes
* fix type in settings.yml: replace suspend_times by suspended_times
* always use delay defined in settings.yml:
  * HTTP status 402 and 403: read the value from settings.yml instead of using the hardcoded value of 1 day.
  * startpage engine: CAPTCHA suspend the engine for one day instead of one week
2023-01-28 10:24:14 +00:00
Ahmad Alkadri 7fc8d72889 [fix] bing: parsing result; check to see if the element contains links
This patch is to hardening the parsing of the bing response:

1. To fix [2087] check if the selected result item contains a link, otherwise
   skip result item and continue in the result loop.  Increment the result
   pointer when a result has been added / the enumerate that counts for skipped
   items is no longer valid when result items are skipped.

   To test the bugfix use:   ``!bi :all cerbot``

2. Limit the XPath selection of result items to direct children nodes (list
   items ``li``) of the ordered list (``ol``).

   To test the selector use: ``!bi :en pontiac aztek wiki``

   .. in the result list you should find the wikipedia entry on top,
   compare [2068]

[2087] https://github.com/searxng/searxng/issues/2087
[2068] https://github.com/searxng/searxng/issues/2068
2023-01-09 15:08:24 +01:00
ahmad-alkadri 9ee99423fe [fix] Bing-Web engine: XPath to get the wikipedia result
Modify the XPath selector to get the wikipedia result plus small fixes.

About result content: especially with the Wikipedia result, we'd get several
paragraph elements, only the first paragraph would be taken and displayed on the
search result
2023-01-08 09:11:16 +01:00
Rudis Muiznieks 128b8c7f0a
Add HTTP Referer header to DuckDuckGo requests
closes #2080
2023-01-06 16:07:37 -06:00
Rudis Muiznieks 6804ff048d
Fix: add trailing slash to duckduckgo url
Close #1854
2022-12-22 07:49:58 -06:00
Alexandre Flament 269326063a Fix: don't crash when engine or name is missing in settings.yml
SearXNG crashes if the engine or name fields are missing.
With this commit, the app displays an error in the log and keeps loading.

Close #1951
2022-12-04 23:43:59 +01:00
Émilien Devos 46ad32343a Switch back to protobuf for raw HTML 2022-11-11 07:39:48 +00:00
ngosang 78be4b4c70 Fix Google search engine.
- Fix broken links. Resolves #1794
- Fix missing results. Resolves #1829
2022-11-11 07:34:19 +01:00
Alexandre Flament 8f19bdaf17
Merge pull request #1882 from fehho/metacpan
Add MetaCPAN engine
2022-11-07 21:54:11 +01:00
fehho fe351c2802 Add MetaCPAN engine 2022-11-07 08:07:06 -06:00
Vasilis Gerakaris 947b62c9d5
Fix floating point format in DDG weather humidity
Fixes #1836
2022-10-20 11:44:17 +03:00
Alexandre FLAMENT 035bc507ec [fix] startpage engine 2022-10-14 18:27:53 +00:00
Alexandre Flament a3148e5115
Merge pull request #1814 from return42/fix-typos
[fix] typos / reported by @kianmeng in searx PR-3366
2022-09-28 09:22:02 +02:00
Alexandre Flament 0e00af9c26
Merge pull request #1810 from return42/fix-1809
[fix] springer: unsupported operand type(s) for +: 'NoneType' and 'str'
2022-09-28 09:20:03 +02:00
Markus Heiser ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Markus Heiser 52023e3d6e [fix] doc of the paper.html template (isbn, issn)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-25 15:46:29 +02:00
Markus Heiser 0052887929 [fix] springer: unsupported operand type(s) for +: 'NoneType' and 'str'
- fix issue reported #1809
- filter out `None` value from issn and isbn list
- add comments (from publicationName)
- add publisher

Closes: https://github.com/searxng/searxng/issues/1809
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-25 15:25:55 +02:00
Markus Heiser e36b023508 [mod] core.ac.uk: add cetgory 'scientific publications'
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-24 16:16:22 +02:00
Alexandre Flament 16443d4f4a [mod] core.ac.uk: try multiple ways to get url
If the url is not found, using:
* the DOI
* the downloadUrl
* the ARK id
2022-09-24 15:02:39 +02:00
Markus Heiser c76830d8a8 [mod] core.ac.uk: use paper.html template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-24 13:19:33 +02:00
Markus Heiser 3ff2ad939d [fix] ERROR searx.engines.core.ac.uk: list index out of range
Some result items from core.ac.uk do not have an URL::

  Traceback (most recent call last):
  File "searx/search/processors/online.py", line 154, in search
    search_results = self._search_basic(query, params)
  File "searx/search/processors/online.py", line 142, in _search_basic
    return self.engine.response(response)
  File "SearXNG/searx/engines/core.py", line 73, in response
    'url': source['urls'][0].replace('http://', 'https://', 1),

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-24 13:19:33 +02:00
Alexandre Flament d6446be38f [mod] science category: various update of about PR 1705 2022-09-23 20:52:55 +02:00
Alexandre FLAMENT e36f85b836 Science category: update the engines
* use the paper.html template
* fetch more data from the engines
* add crossref.py
2022-09-23 20:45:58 +02:00
Alexandre Flament bef3984d03
Merge pull request #1728 from liimee/eng-ddw
add duckduckgo weather engine
2022-09-23 18:14:09 +02:00
Alexandre Flament d3fec1388c
Merge pull request #1624 from liimee/eng-wttr
Add wttr.in engine
2022-09-23 18:13:37 +02:00
Alexandre Flament 1a7b6872b5
Merge pull request #1792 from unixfox/google-images-internal-api
use the internal API for google images
2022-09-21 19:50:38 +02:00
Markus Heiser cf7ee67f71 [mod] google-images: slightly improvements of the engine
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-21 18:59:55 +02:00
Emilien Devos df5f8d0e8e use the internal API for google images 2022-09-20 22:52:38 +02:00
Markus Heiser dcf1d408a5 [fix] google-news: origin result does not have a content area
The google news are in a rework, the content area of a news item has been
removed.

Closes: https://github.com/searxng/searxng/issues/1790
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-20 20:18:43 +02:00
Markus Heiser fbf07237ff [fix] and improve docs generated from source code.
Fix::

    searx/locales.py:docstring of searx.locales.get_engine_locale:17: \
      WARNING: Definition list ends without a blank line; unexpected unindent.

Improvement: don't show default values in the generated documentation whe it is
more a mess than a usefull information (`:meta hide-value:`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-18 12:44:12 +02:00
Alexandre FLAMENT dd0887be18 xpath engine: change raise_for_httperror to no_result_for_http_status
no_result_for_http_status contains a list of HTTP status.
These HTTP status are seen an empty result list.
In other cases an exception is thrown as usual.

Previously raise_for_httperror were ignoring all HTTP error,
which make defective engines invisible in the stats.
2022-09-04 09:07:28 +02:00
Markus Heiser a15dfa5ee1 [fix] engine woxikon.de - don't raise exception on empty result list
Woxikon expects a word in German, so with query "foo" the site finds nothing and
respons a 404:

    httpx.HTTPStatusError: Client error '404 Not Found' \
      for url 'https://synonyme.woxikon.de/synonyme/foo.php'

[1] https://github.com/searxng/searxng/issues/1543#issuecomment-1193317054

Closes: https://github.com/searxng/searxng/issues/1543
Suggested-by: @allendema [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-04 09:07:28 +02:00
Markus Heiser 8e9fb0b435
Merge pull request #1647 from return42/deepl-engine
[mod] add deepl translation engine
2022-09-02 14:09:22 +02:00
ta 85b5293e40 simplify infobox result 2022-08-31 18:29:50 +07:00
ta 12f7d4a46b add duckduckgo weather engine 2022-08-31 17:29:32 +07:00
Alexandre Flament 56000d5162
Merge pull request #1699 from liimee/eng-app-store
add apple app store engine
2022-08-27 07:43:23 +02:00
Alexandre Flament 44bc94c36e
Merge pull request #1700 from liimee/eng-ddm
add apple maps engine
2022-08-27 07:36:16 +02:00
ta 5057007270 remove thumbnail from results 2022-08-27 06:23:30 +07:00
ta 525946d7dd add poi's website and phone number, doesn't crash when there is no `displayMapRegion`, query the token on the first request 2022-08-27 06:17:58 +07:00
ta 5dce299b22 add apple maps engine 2022-08-25 17:05:40 +07:00
ta cef7bbab22 get the not cropped version of the thumbnail when the image height is not too important 2022-08-24 18:33:11 +07:00
ta 78bff4618c add safesearch support 2022-08-24 18:31:04 +07:00
ta bcae7ae4e3 add developer info as author 2022-08-24 17:50:38 +07:00
ta e5c1b64b1d add the apple app store engine
The Apple App Store is the digital app distribution platform for iOS & iPadOS.
2022-08-24 17:27:36 +07:00
ta 040e24f9ad support playing videos directly 2022-08-24 16:48:31 +07:00
ta 79d06509c1 add tags as suggestions 2022-08-23 05:18:35 +07:00
ta d22f469010 use `invalid-name` instead of `C0103` for pylint 2022-08-22 18:27:35 +07:00
ta dd9127492f add 9gag engine
9GAG is a social media website where users upload and share user-generated images and videos
2022-08-22 17:35:07 +07:00
ta e64cca8c3f don't raise error when nothing was found 2022-08-22 17:04:29 +07:00
M Asenov faa32d5773 fixed xpath selector for appropriate results 2022-08-21 20:08:00 +01:00
Alexandre Flament 5ed40af3ba
Merge pull request #1661 from liimee/eng-tw
Add twitter engine
2022-08-21 15:21:18 +02:00
Markus Heiser 77a0f33819 [fix] engine duden - don't raise exception on empty result list
Duden expects a word in German, so with query "amazing" the site finds nothing
and respons a 404:

    httpx.HTTPStatusError: Client error '404 Not Found' for url\
      'https://www.duden.de/suchen/dudenonline/amazing'

[1] https://github.com/searxng/searxng/issues/1543#issuecomment-1193317054

Suggested-by: @allendema [1]
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-20 08:41:03 +02:00
ta 05851978cf add explanation of token 2022-08-17 19:45:42 +07:00
ta c8acd4a3b6 add profile image to user results 2022-08-17 14:30:59 +07:00
ta b6fd7cd571 add thumbnail to results if available 2022-08-17 14:25:22 +07:00
Markus Heiser 27385e7898 [mod] qwant - add safesearch option
Closes: https://github.com/searxng/searxng/issues/1640
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser 6579d6d558 [fix] qwant - API error::locale must be one ..
The request function should not request a language (aka locale) that is not
supported by qwant. Select a locale like zh-TW ends in qwant's API error:

  ERROR searx.engines.qwant news: exception : \
  API error::locale must be one of the following values: \
    en_gb, en_ie, en_us, en_ca, en_my, en_au, en_nz, de_de, de_ch, de_at, fr_fr, \
    fr_be, fr_ch, fr_ca, fr_ad, fc_ca, co_fr, es_es, es_ar, es_cl, es_co, es_mx, \
    es_pe, es_ad, ca_es, ca_ad, ca_fr, eu_es, eu_fr, it_it, it_ch, pt_pt, pt_ad, \
    nl_be, nl_nl

The existing searx.utils.match_language function is unsuitable for this purpose,
it is replaced by function searx.locales.get_engine_locale that is based on the
methods from the babel package.

The quant's _fetch_supported_languages function has been revised to filter out
languages 8aka locales) not supported by qwant.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:36:14 +02:00
Markus Heiser 75bb8c45d0 [mod] decouple qwant's categories from SearXNG's categories
By using new property `qwant_categ:` the category of qwant is no longer bound to
the category of SearXNG.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:26:54 +02:00
ta 96ea355a1f add twitter engine 2022-08-14 08:39:41 +07:00
Markus Heiser eb02cc77c5 [fix] google - simplify XPath selectors to fetch more results
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-10 18:55:31 +02:00
Émilien Devos b9f16a77db output format protobuf to HTML for google mobile 2022-08-10 09:36:06 +00:00
Thomas Renard d4acbcfe63 [mod] add deepl translation engine
This implements the Deepl Translation engine. It works nearly like lingva but
directly to the deepl API.  This api only needs a to-lang, from-lang is a fake
by now.

There is a free option to use [1].

[1] https://www.deepl.com/pro-api?cta=header-pro-api for registering a free account.
2022-08-10 09:14:36 +02:00
Brock Vojković 24210fb10b
Revert PR #1633
This reverts the changes made to the Google results XPath in PR #1633.
2022-08-10 03:41:39 +02:00
Léon Tiekötter 94b3656b4a [fix] google engine: results XPath
Seems google rolls out changes first on the `google.com` domain and later on the
"language" domains.  By example: yesterday [1] `google.com` did not work but
`google.de` and `google.fr` did work, today they do not work any longer and this
fix is needed on all domains.

Closes: https://github.com/searxng/searxng/issues/1628
[1] https://github.com/searxng/searxng/issues/1628#issuecomment-1208191816
2022-08-09 06:23:59 +02:00
liimee 8c318562e2
add description and wikidata ID to wttr.in engine 2022-08-07 14:57:10 +07:00
ta 8aa018db95 add wttr.in engine 2022-08-07 13:04:18 +07:00
Markus Heiser 8df1f0c47e [mod] add 'Accept-Language' HTTP header to online processores
Most engines that support languages (and regions) use the Accept-Language from
the WEB browser to build a response that fits to the language (and region).

- add new engine option: send_accept_language_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-01 17:01:59 +02:00
Alexandre Flament 2babf59adc [fix] pyright repported errors
The errors make pyright usage useless since a new error won't be seen [1].

[1] https://github.com/searxng/searxng/pull/1569

```
  searx/compat.py:11:27 - error: Expression of type "Type[cached_property[_T@cached_property]]" cannot be assigned to declared type "Type[cached_property]"
    "Type[cached_property[_T@cached_property]]" is incompatible with "Type[cached_property]"
    Type "Type[cached_property[_T@cached_property]]" cannot be assigned to type "Type[cached_property]" (reportGeneralTypeIssues)
  searx/utils.py:69:36 - error: Expression of type "None" cannot be assigned to parameter of type "str"
    Type "None" cannot be assigned to type "str" (reportGeneralTypeIssues)
  searx/utils.py:573:85 - error: Expression of type "None" cannot be assigned to parameter of type "int"
    Type "None" cannot be assigned to type "int" (reportGeneralTypeIssues)
  searx/webapp.py:1306:22 - error: Argument of type "str" cannot be assigned to parameter "__a" of type "BytesPath" in function "join"
    Type "str" cannot be assigned to type "BytesPath"
      "str" is incompatible with "bytes"
      "str" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:68 - error: Argument of type "Literal['themes']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['themes']" cannot be assigned to type "BytesPath"
      "Literal['themes']" is incompatible with "bytes"
      "Literal['themes']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:78 - error: Argument of type "str | Any | None" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "str | Any | None" cannot be assigned to type "BytesPath"
      Type "str" cannot be assigned to type "BytesPath"
        "str" is incompatible with "bytes"
        "str" is incompatible with protocol "PathLike[bytes]"
          "__fspath__" is not present (reportGeneralTypeIssues)
  searx/webapp.py:1306:85 - error: Argument of type "Literal['img']" cannot be assigned to parameter "paths" of type "BytesPath" in function "join"
    Type "Literal['img']" cannot be assigned to type "BytesPath"
      "Literal['img']" is incompatible with "bytes"
      "Literal['img']" is incompatible with protocol "PathLike[bytes]"
        "__fspath__" is not present (reportGeneralTypeIssues)
  searx/engines/mongodb.py:8:6 - warning: Import "pymongo" could not be resolved (reportMissingImports)
  searx/engines/mysql_server.py:9:8 - warning: Import "mysql.connector" could not be resolved (reportMissingImports)
  searx/engines/postgresql.py:9:8 - warning: Import "psycopg2" could not be resolved from source (reportMissingModuleSource)
  searx/engines/xpath.py:187:28 - warning: "categories" is not defined (reportUndefinedVariable)
  searx/search/__init__.py:184:82 - warning: "flask" is not defined (reportUndefinedVariable)
  searx/search/checker/background.py:19:26 - error: Type of "schedule" is partially unknown
    Type of "schedule" is "(delay: Any, func: Any, *args: Any) -> Literal[True]" (reportUnknownVariableType)
  searx/shared/__init__.py:8:12 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
  searx/shared/shared_uwsgi.py:5:8 - warning: Import "uwsgi" could not be resolved (reportMissingImports)
```
2022-07-30 18:04:44 +02:00
Markus Heiser c72d70d45c Revert "Quick fix for google engine for EU countries"
This reverts commit 747cf1a246.
2022-07-26 06:39:44 +02:00
Léon Tiekötter 950f036c03
[fix] google engine: results XPath 2022-07-26 00:24:15 +02:00
Émilien Devos 747cf1a246
Quick fix for google engine for EU countries
This revert part of the commit of 5fb2071cb2
2022-07-25 20:48:50 +00:00
Markus Heiser 0be0e63117 [fix] demo_online.py - fixed typo
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25 20:04:00 +02:00
Emilien Devos 5fb2071cb2 [fix] google & youtube - set EU consent cookie
This change the previous bypass method for Google consent using
``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``.

The youtube_noapi and google have a similar API, at least for the consent[1].

Get CONSENT cookie from google reguest::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

  Google change the way for consent about YouTube cookies agreement in EU
  countries. Instead of showing a popup in the website, YouTube redirects the
  user to a new webpage at consent.youtube.com domain ...  Fix for this is to
  put a cookie CONSENT with YES+ value for every YouTube request

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
2022-07-25 13:27:06 +02:00
Markus Heiser 4231a5770b [fix] sjp engine - convert enginename to a latin1 compliance name
The engine name is not only a *name* its also a identifier that is used in
logs, HTTP headers and more.  Unicode characters in the name of an engine could
cause various issues.

Closes: https://github.com/searxng/searxng/issues/1544
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-24 21:10:55 +02:00
james-still 2516e21c58 [fix] emojipedia - update XPath to be relative 2022-07-24 19:14:26 +02:00
Markus Heiser 1540891561 [fix] engine tineye: handle 422 response of not supported img format
Closes: https://github.com/searxng/searxng/issues/1449
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-23 16:00:58 +02:00
Markus Heiser 4e05197444
Merge pull request #1475 from return42/Emojipedia
[mod] Add engine for Emojipedia
2022-07-15 09:30:40 +02:00
Jay 10edcbe3c2 [mod] Add engine for Emojipedia
Emojipedia is an emoji reference website which documents the meaning and
common usage of emoji characters in the Unicode Standard.  It is owned by Zedge
since 2021. Emojipedia is a voting member of The Unicode Consortium.[1]

Cherry picked from @james-still [2[3] and slightly modified to fit SearXNG's
quality gates.

[1] https://en.wikipedia.org/wiki/Emojipedia
[2] 2fc01eb20f
[3] https://github.com/searx/searx/pull/3278
2022-07-15 09:26:44 +02:00
Alexandre Flament 44f2eb50a5
Merge pull request #1219 from dalf/follow_bing_redirect
bing.py: remove redirection links
2022-07-10 18:06:22 +02:00
Emilien Devos 6face215b8 bypass google consent with ucbcb=1 2022-07-09 21:33:24 +00:00
Alexandre Flament a1e8af0796 bing.py: resolve bing.com/ck/a redirections
add a new function searx.network.multi_requests to send multiple HTTP requests at once
2022-07-08 22:02:21 +02:00
Markus Heiser 970a69012b [fix] engine z-zlibrary https URL
before this patch:

    DEBUG   searx.engines.z-library : using base_url: https:https://de1lib.org

with this patch URL is fixed to:

    DEBUG   searx.engines.z-library : using base_url: https://de1lib.org

Closes: https://github.com/searxng/searxng/issues/1435
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-05 22:27:55 +02:00
ta 14756a2674 [mod] Adds Lingva translate engine
Add the lingva engine (which grabs data from google translate).  Results from
Lingva are added to the infobox results.
2022-07-04 19:06:45 +02:00
Markus Heiser 5831c15b49 [fix] engines/openstreetmap.py typo: user_langage --> user_language
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-02 16:51:25 +02:00
Alexandre Flament 6716c6b0c3 openstreetmap engine: return the localized named.
For example: display "Tokyo" instead of "東京都" when the language is English.
2022-07-02 16:51:25 +02:00
ta 8883aed132 [fix] google play apps engine: implement engines/google_play_apps.py 2022-06-18 16:02:39 +02:00
Alexandre Flament 5bcbec9b06 Fix: use sys.modules.copy() to avoid RuntimeError
use sys.modules.copy() to avoid "RuntimeError: dictionary changed size during iteration"
see https://github.com/python/cpython/issues/89516
and https://docs.python.org/3.10/library/sys.html#sys.modules

close https://github.com/searxng/searxng/issues/1342
2022-06-18 07:39:46 +02:00
Alexandre Flament 2455f1d06a
Merge pull request #1308 from allendema/add-yep-com-json
[enh] Add yep.com via json_engine
2022-06-12 11:09:04 +02:00
Allen fd9a13a3e5 [enh] Initial no paging support for Yep.com
Upstream example query:
https://yep.com/web?q=test

https://yep.com/about
2022-06-11 14:17:44 +02:00
Alexandre Flament cd2dd5dd55 Wikidata engine: ignore dummy entities
Close #641
2022-06-11 11:09:21 +02:00
Alexandre Flament d068b67a71 Wikidata engine: minor change of the SPARQL request
The engine can be slow especially when the query won't return any answer.
See https://www.mediawiki.org/wiki/Wikidata_Query_Service/User_Manual/MWAPI#Find_articles_in_Wikipedia_speaking_about_cheese_and_see_which_Wikibase_items_they_correspond_to

Related to #1290
2022-06-11 10:50:11 +02:00
Markus Heiser 2de007138c [fix] prepare for pylint 2.14.0
Remove issue reported by Pylint 2.14.0:

- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
  for max(), min() and sum(). [2]

.pylintrc:
  - <option name>-hint has been removed since long, Pylint 2.14.0 raises an
    error on invalid options
  - bad-continuation and bad-whitespace have been removed [3]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-03 15:41:52 +02:00
Allen 43dc9eb7d6 [enh] Initial Petalsearch Images support
Upstream example query:

  https://petalsearch.com/search?query=test&channel=image&ps=50&pn=1&region=de-de&ss_mode=off&ss_type=normal

Depending on locale it will internally use some/all results from other
engines. See:

  https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/#general-indexing-search-engines
2022-06-02 14:32:37 +02:00
Émilien Devos 06cb15cbf7
Reflect the real world parameter from settings.yml 2022-05-10 20:44:35 +00:00
Markus Heiser 4326009d00 [format.python] based on bugfix in 9ed626130 2022-05-07 18:23:10 +02:00
capric98 8c7e6cc983 [fix] FutureWarning from lxml
Just in case if content is None, the original code will skip extract_text(), and
just append the None value to 'content'. So just add allow_none=True, and this
will return None without raising a ValueError in extract_text().
2022-04-22 16:09:36 +02:00
Alexandre Flament bbf13a4657
Merge pull request #1101 from allendema/pass-cookies-from-settings
[enh] Allow passing headers/cookies from settings.yml
2022-04-17 11:37:07 +02:00
Allen dae8a08089
[fix[ Update only cookies/headers 2022-04-17 11:29:23 +02:00
Allen 67fb6fba84
[lint] Remove whitespace
From GH GUI
2022-04-17 10:42:25 +02:00
Allen 15862ebc35
[mod] Pass desired ebay domain in settings
https://www.ebay.de
https://www.ebay.com
htttps://www.ebay.es

etc
2022-04-16 19:10:35 +02:00
Allen 155333f625
[enh] Allow passing headers/cookies from settings.yml
Example:

   - engine: xpath
   - search_url: example.org
   - headers: {'example_header': 'example_header'}
   - cookies: {'safesearch': 'off'}
2022-04-16 17:42:04 +02:00
Alexandre Flament c474616642
Merge pull request #1071 from return42/fix-lang-dailymotion
[fix] dailymotion engine: filter by language & country
2022-04-16 11:54:49 +02:00
Alexandre Flament 1a82e79b50 dailymotion: send valid value for the language parameter 2022-04-16 09:27:34 +02:00
Markus Heiser 3bb62823ec [fix] dailymotion engine: filter by language & country
- fix the issue of fetching more the 7000 *languages*
- improve the request function and filter by language & country
- implement time_range_support & safesearch
- add more fields to the response from dailymotion (allow_embed, length)
- better clean up of HTML tags in the 'content' field.

This is more or less a complete rework based on the '/videos' API from [1].
This patch cleans up the language list in SearXNG that has been polluted by the
ISO-639-3 2 and 3 letter codes from dailymotion languages which have never been
used.

[1] https://developers.dailymotion.com/tools/

Closes: https://github.com/searxng/searxng/issues/1065
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-04-16 09:27:34 +02:00
Jabster28 9eb1b04f48
change "Wolfram|Alpha" to "Wolfram Alpha" in search results 2022-04-12 10:37:33 +01:00
Alexandre Flament 592cea0e5e
Merge pull request #1030 from austinhuang0131/master
(feat) add jisho.org
2022-04-09 18:57:20 +02:00
Alexandre Flament 74c7aee9ec jisho : code refactoring 2022-04-09 18:01:57 +02:00
Austin Huang 19fa0095a0
(fix) satisfy the linter, and btw reduce timeout 2022-04-01 09:23:24 -04:00
Austin Huang a399248f56
update jisho.py according to suggestions 2022-04-01 09:18:19 -04:00
Alexandre FLAMENT f00cdb5e51 bing engine: _fetch_supported_languages: don't use the language code as a country
ref #1029
2022-03-31 20:03:34 +00:00
Austin Huang 934ae4e086
(feat) add jisho.org
Closes #1016
2022-03-31 14:45:39 -04:00
Alexandre Flament 378b29be2f fix startpage: update XPath in _fetch_supported_languages 2022-03-19 14:16:37 +01:00
Markus Heiser 53b5a804e2 [fix] engine mediathekviewweb: replace http links by https
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-07 19:49:16 +01:00
Markus Heiser 20f4538e13 [fix] engine: Semantic Scholar (Science) // rework & fix
Closes: https://github.com/searxng/searxng/issues/939
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-03-05 11:53:41 +01:00
Markus Heiser 8d937179ab
Merge pull request #913 from return42/add-artwork
[mod] add artwork to mixcloud & soundcloud engines
2022-02-21 22:24:40 +01:00
Markus Heiser b08b81b434 [mod] bandcamp & genius: in result set img_src instead thumbnail
Suggested-by: @dalf https://github.com/searxng/searxng/pull/900#issuecomment-1046009057
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser bded1ee280 [fix] genius: add player an avoid exceptional programming
Add player:

- The players are just playing 30sec from the title.  Some of the player will be
  blocked because of a cross-origin request and some players will link to apple
  when you press the play button.

Avoid exceptions and (and BTW improve results)

-  ERROR   searx.engines.genius          : list index out of range

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-21 22:12:07 +01:00
Markus Heiser 36aee70c24
Merge pull request #910 from tiekoetter/fix-909
[fix] google images engine: Fix 'scrap_img_by_id' function
2022-02-20 18:29:50 +01:00
Markus Heiser 2921d3cd17 [mod] add artwork to mixcloud & soundcloud engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 21:59:12 +01:00
Markus Heiser 4a28b593c2 [fix] google images engine: Fix 'scrap_img_by_id' function
The 'scrap_img_by_id' function didn't return any longer anything useful.  This
fix allows the google images engine to present the full source image instead of
only the thumbnail.

The function scrap_img_by_id() is rpelaced by a fully rewrite to parse image
URLs by a regular expression. The new function parse_urls_img_from_js(dom)
returns a mapping of data-id to image URL.

Closes: https://github.com/searxng/searxng/issues/909
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 14:33:56 +01:00
Alexandre Flament ace5401632
Merge pull request #900 from return42/fix-883
[fix] bandcamp: fix itemtype (album|track) and exceptions
2022-02-19 13:42:53 +01:00
Markus Heiser 943a7fdcb5 [mod] mediathekviewweb engine: add iframe_src and use videos template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-19 00:50:54 +01:00
Markus Heiser 05c105b837 [fix] bandcamp: fix itemtype (album|track) and exceptions
BTW: polish implementation and show tracklist for albums

Closes: https://github.com/searxng/searxng/issues/883
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 22:44:43 +01:00
Markus Heiser 7352c6bc79 [mod] templates: rename field for <iframe> URL to iframe_src
Rename result field data_src to iframe_src

Suggested-by: @dalf https://github.com/searxng/searxng/pull/882#issuecomment-1037997402
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-18 19:00:49 +01:00
Markus Heiser 98cab4cf75 [mod] result_templates/default.html replace embedded HTML by data_src audio_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src' (and 'audio_src'), an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Markus Heiser 46e131fdad [mod] result_templates/videos.html: replace embedded HTML by data_src
Embedded HTML breaks SearXNG architecture.  To modularize, HTML is generated in
the templates (oscar & simple) and result parameter 'embedded' is replaced by
'data_src', an URL for embedded content (<iframe>).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-02-13 14:20:47 +01:00
Émilien Devos 7d3e8118b0
Update the XPath for fetching the Google results 2022-02-09 14:34:14 +01:00