Commit Graph

425 Commits

Author SHA1 Message Date
Markus Heiser c80e82a855 [mod] DuckDuckGo: reverse engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the DuckDuckGo (DDG) engines, including
improved language and region handling based on the engine.traits_v1 data.

- DDG Lite
- DDG Instant Answer API
- DDG Images
- DDG Weather

docs/src/searx.engine.duckduckgo.rst:
  Online documentation of the DDG engines (make docs.live)

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - "duckduckgo" (WEB),
  - "duckduckgo images" and
  - "duckduckgo weather"

  and remove data from obsolete data type "supported_languages".

searx/autocomplete.py:
  Reverse-engineered autocomplete from DDG.  Supports DDG's languages.

searx/engines/duckduckgo.py:
  - fetch_traits():  Fetch languages & regions from DDG.

  - get_ddg_lang(): Get DDG's language identifier from SearXNG's locale.  DDG
    defines its languages by region codes.  DDG-Lite does not offer a language
    selection to the user, only a region can be selected by the user.

  - Cache ``vqd`` value: The vqd value depends on the query string and is needed
    for the follow up pages or the images loaded by a XMLHttpRequest (DDG
    images).  The ``vqd`` value of a search term is stored for 10min in the
    redis DB.

  - DDG Lite engine: reverse-engineered request method with improved language
    and region support and better ``vqd`` handling.
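
The ``vqd`` caching described in the list above can be sketched as follows.  This is a minimal illustration and not the code from this commit: an in-memory dict stands in for the Redis DB, and the names ``VqdCache``, ``put``, ``get`` as well as the key derivation are hypothetical.

```python
import hashlib
import time


class VqdCache:
    """In-memory stand-in for a Redis store with a time-to-live (hypothetical)."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # 10 minutes, as stated in the commit message
        self._clock = clock
        self._store = {}  # key -> (expiry timestamp, vqd value)

    @staticmethod
    def _key(query):
        # The vqd value depends on the query string, so the query is the cache key.
        return "vqd:" + hashlib.sha256(query.encode("utf-8")).hexdigest()

    def put(self, query, vqd):
        self._store[self._key(query)] = (self._clock() + self.ttl_seconds, vqd)

    def get(self, query):
        key = self._key(query)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, vqd = entry
        if self._clock() >= expires_at:
            # Entry is older than the TTL: drop it so a fresh vqd gets requested.
            del self._store[key]
            return None
        return vqd
```

With a real Redis client the same effect is usually achieved by storing the key with an expiry of 600 seconds, so that Redis itself discards stale values.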

searx/engines/duckduckgo_definitions.py: DDG Instant Answer API
  The *instant answers* API does not support languages, or at least we could not
  find out how language support should work.  It seems that most of the features
  are based on English terms.

searx/engines/duckduckgo_images.py: DDG Images
  Reverse-engineered request method.  Improved language and region handling
  based on cookies and the engine.traits_v1 data.  Response: add image format to
  the result list.

searx/engines/duckduckgo_weather.py: DDG Weather
  Improved language and region handling based on cookies and the
  engine.traits_v1 data.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser e9afc4f8ce [mod] Startpage: reverse engineered & upgrade to data_type: traits_v1
One reason for the frequent CAPTCHAs on Startpage requests is the incomplete
requests SearXNG sends to startpage.com.  This patch is a completely new
implementation of the ``request()`` function, reverse engineered from
Startpage's search form.  The new implementation:

- uses traits of data_type: traits_v1 and drops deprecated data_type: supported_languages
- adds time-range support
- adds safe-search support
- fixes searxng/searxng/issues 1884
- fixes searxng/searxng/issues 1081 --> improvements to avoid CAPTCHA

In preparation for more categories (News, Images, Videos ..) from Startpage, the
variable ``startpage_categ`` was set up.  The default value is ``web`` and other
categories from Startpage are not yet implemented.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 858aa3e604 [mod] wikipedia & wikidata: upgrade to data_type: traits_v1
BTW this fixes an issue in wikipedia: SearXNG's locales zh-TW and zh-HK now
use the language `zh-classical` from wikipedia (and not `zh`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser e0a6ca96cc [doc] add a description of bing engines (web, news, video, images)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c9cd376186 [mod] replace searx.languages by searx.sxng_locales
With the language and region tags from the EngineTraitsMap the handling of
SearXNG's tags of languages and regions has been normalized and is no longer
a *mystery*.  The "languages" became "locales" that are supported by babel and
by this, the update_engine_traits.py can be simplified a lot.

Other code places can be simplified as well, but these simplifications
should (respectively can) only be done when none of the engines work with the
deprecated EngineTraits.supported_languages interface anymore.

This commit replaces searx.languages by searx.sxng_locales and renames some
identifiers from "language" to "locale" (e.g. language_codes --> sxng_locales).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 61383edb27 [mod] Startpage: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Startpage engine.

.. note::

   Does not include migration of the request method from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser a7fe22770a [mod] Peertube: re-engineered & upgrade to data_type: traits_v1
- fetch_traits(): Fetch languages from peertube's search-index source code.

  [mod] Include migration of the request method from 'supported_languages'
        to 'traits' (EngineTraits) object.
  [fix] old supported_languages_url is no longer valid since the sources
        have been moved to a different path.

- fixed code to pass pylint
- request(): complete re-implementation based on the API docs [1]
- response(): complete re-implementation, adds several fields missed before
- add source code documentation

[1] https://docs.joinpeertube.org/api-rest-reference.html#tag/Search/operation/searchVideos

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 6e5f22e558 [mod] replace engines_languages.json by engines_traits.json
Implementations of the *traits* of the engines.

Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*.  Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.
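
The mapping idea described above can be sketched as follows.  This is only an illustration under stated assumptions: ``TraitsSketch`` and its methods are hypothetical names, not the ``EngineTraits`` class of this commit, and the example values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class TraitsSketch:
    """Hypothetical, simplified stand-in for an engine's traits."""

    # SearXNG language tag -> language identifier used by the origin engine
    languages: dict = field(default_factory=dict)
    # SearXNG region tag -> region identifier used by the origin engine
    regions: dict = field(default_factory=dict)

    def engine_language(self, sxng_locale, default=None):
        # Try the full tag first, then fall back to the bare language part.
        if sxng_locale in self.languages:
            return self.languages[sxng_locale]
        return self.languages.get(sxng_locale.split("-")[0], default)

    def engine_region(self, sxng_locale, default=None):
        return self.regions.get(sxng_locale, default)


# Example values are invented for illustration only.
traits = TraitsSketch(
    languages={"en": "lang_en", "de": "lang_de"},
    regions={"de-CH": "ch-de"},
)
```

A request function could then ask the sketch for the engine-side identifier of the user's locale and fall back to a default when the engine does not know that locale.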

To load traits from the persistence::

    searx.enginelib.traits.EngineTraitsMap.from_data()

For new traits new properties can be added to the class::

    searx.enginelib.traits.EngineTraits

.. hint::

   Implementation is backward compatible with the deprecated *supported_languages
   method* from the vintage implementation.

   The vintage code is tagged as *deprecated* and can be removed when all engines
   have been ported to the *traits method*.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Alexandre Flament d669da81fb
Merge pull request #2027 from dalf/fix_2018
Add "auto" as a language.
2023-02-20 12:17:38 +01:00
Alexandre Flament 6748e8e2d5 Add "Auto-detected" as a language.
When the user chooses "Auto-detected", the choice persists for subsequent queries.
The detected language is displayed.

For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.

This replaces the autodetect_search_language plugin.
2023-02-17 15:17:36 +00:00
Markus Heiser 5820dc78ce [doc] slight improvements to the doc of the settings (base_url)
Closes: https://github.com/searxng/searxng/issues/2190

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-17 12:08:58 +01:00
blob42 27809c84f8 [doc] add example for enabling an engine disabled by default 2023-02-12 18:33:38 +01:00
Markus Heiser 031162be04 [doc] settings.py document search.suspended_times
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-01-29 19:26:16 +00:00
Markus Heiser feccee01c0 [doc] Add doc-strings to searx.exceptions
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-01-29 19:06:19 +01:00
Alexandre Flament a9d6f7532a weblate: migration to https://translate.codeberg.org/ 2023-01-21 15:45:12 +00:00
Julian 6e8c7873ee
Correct my small mistake 2023-01-04 20:07:51 +01:00
Julian 2b886ab269
Correct a small mistake 2023-01-04 13:41:41 +01:00
Alexandre Flament 9e9f57e48b
Merge pull request #1954 from dalf/fix.redis.init.2
[fix] follow up of PR-1856
2022-12-14 07:08:19 +01:00
Markus Heiser ed901ab18e [mod] improve 'Autodetect search language' plugin
- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model

Resources::

    import fasttext                                    # --> +10 MB
    fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB

Suggested-by: @dalf

- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
  py3-numpy from Alpine's package manager (apk)
- Alpine Linux on docker-images (musl libc) does not support fasttext-wheel (gnu
  libc) --> patch Dockerfile and build from fasttext:

     sed -i s/fasttext-wheel/fasttext/ requirements.txt

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-12-11 11:26:07 +01:00
Alexandre Flament 3050e2b6e8 [fix] documentation about update-searxng.rst 2022-12-10 10:06:54 +01:00
Alexandre Flament b971167ced move searx.shared.redisdb to searx.redisdb 2022-12-10 09:26:38 +01:00
Alexandre FLAMENT e92755d358 Initialize Redis in searx/webapp.py
settings.yml:
* The default URL was unix:///usr/local/searxng-redis/run/redis.sock?db=0
* The default URL is now "false"

The default URL makes the log difficult to deal with:
if the admin didn't install a Redis instance, the logs record a false error.

It worked before because SearXNG initialized the Redis connection when the limiter started.

In this commit, SearXNG initializes Redis in searx/webapp.py
so various components can use Redis without taking care of the initialization step.
2022-11-05 17:45:52 +01:00
Alexandre Flament 32e8c2cf09 searx.network: add "verify" option to the networks
Each network can define a verify option:
* false to disable certificate verification
* a path to an existing certificate.

SearXNG uses SSL_CERT_FILE and SSL_CERT_DIR when they are defined
see https://www.python-httpx.org/environment_variables/#ssl_cert_file
2022-10-14 13:59:22 +00:00
Alexandre Flament a3148e5115
Merge pull request #1814 from return42/fix-typos
[fix] typos / reported by @kianmeng in searx PR-3366
2022-09-28 09:22:02 +02:00
Markus Heiser ba8959ad7c [fix] typos / reported by @kianmeng in searx PR-3366
[PR-3366] https://github.com/searx/searx/pull/3366

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-27 18:32:14 +02:00
Markus Heiser 52023e3d6e [fix] doc of the paper.html template (isbn, issn)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-25 15:46:29 +02:00
Markus Heiser f08165f524 [docs] add description of the field 'type' from paper.html template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-24 10:56:07 +02:00
Alexandre Flament d6446be38f [mod] science category: various updates about PR 1705 2022-09-23 20:52:55 +02:00
Markus Heiser 08b8859705 [doc] paper.html result template
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-23 20:45:58 +02:00
Markus Heiser ad8ffd222c [mod] option 'ui: cache_url:' to configure internet cache or archive
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-09-04 09:42:20 +02:00
Markus Heiser 8df1f0c47e [mod] add 'Accept-Language' HTTP header to online processors
Most engines that support languages (and regions) use the Accept-Language header
from the web browser to build a response that fits the language (and region).

- add new engine option: send_accept_language_header
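
How such a header could be derived from the selected locale is sketched below.  This is an assumption-laden illustration, not the code of this commit: the function names and the quality values (``q=0.9``, ``q=0.5``) are hypothetical, only the option name ``send_accept_language_header`` is taken from the message above.

```python
def accept_language_header(language, territory=None):
    """Build an Accept-Language value from a language and an optional territory."""
    if territory:
        # Prefer the regional variant, then the bare language, then anything.
        return f"{language}-{territory},{language};q=0.9,*;q=0.5"
    return f"{language},*;q=0.5"


def build_headers(engine_options, language, territory=None):
    """Return the request headers for an engine (hypothetical helper)."""
    headers = {}
    # Only engines that opt in get the header.
    if engine_options.get("send_accept_language_header"):
        headers["Accept-Language"] = accept_language_header(language, territory)
    return headers
```

Making the header opt-in per engine keeps requests unchanged for engines that already receive the language through their own query parameters.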

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-01 17:01:59 +02:00
Markus Heiser 48968bf46a [doc] list of changes that affect the infrastructure
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-31 13:08:06 +02:00
Markus Heiser bded8750d5 [mod] fix minor leftovers from PR #1332
Related: https://github.com/searxng/searxng/pull/1332
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 16:13:47 +02:00
Léon Tiekötter e5323b8aa2 [docs] corrections from @tiekoetter's review 2022-07-30 13:39:35 +02:00
Markus Heiser 6fbffe9d20 [docs] add section "Migrate and stay tuned!"
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 13:39:35 +02:00
Markus Heiser ed8a169029 [doc] update documentation of the installation procedures
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 13:39:35 +02:00
Markus Heiser 782f73540e [utils/searxng.sh] implement new script to install SearXNG
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 13:39:35 +02:00
Markus Heiser 81bba44869 [install scripts] rename SEARX_<name> variables to SEARXNG_<name>
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-30 13:39:35 +02:00
Markus Heiser 5795c1971f [fix] update documentation of Search-API (/dev)
This patch fixes a leftover from [#1548], the list of the plugins was not
up-to-date:

- HTTPS_rewrite has been removed (247c46c6b)
- DOAI_rewrite is named 'Open_Access_DOI_rewrite' (575159b)

[#1548] https://github.com/searxng/searxng/pull/1548

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-25 17:22:31 +02:00
Markus Heiser eb85474920 [fix] update documentation of the Search-API (/dev)
Closes: https://github.com/searxng/searxng/issues/1541
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-24 20:15:03 +02:00
Solirs 6d646129c3 [mod] add tor_check plugin - convenient tor checking through searxng 2022-07-19 07:34:54 +02:00
Brock Vojković 84e2a3bd3f Add infinite scroll as a setting in settings.yml 2022-07-09 17:33:25 +00:00
Alexandre Flament df837d8b1b
Merge pull request #1428 from return42/fix-center_aligment
fix typo and document preference 'center_alignment' in the 'ui' section
2022-07-07 09:43:12 +02:00
Markus Heiser d3226b3df5 [fix] Sphinx 5.x: will warn about misleading extlink definitions
This patch fixes the WARNING messages that pop up since Sphinx 5.x:

    WARNING: extlinks: Sphinx-6.0 will require a caption string to contain
             exactly one '%s' and all other '%' need to be escaped as '%%'.

[1] https://www.sphinx-doc.org/en/master/usage/extensions/extlinks.html
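
An ``extlinks`` definition that satisfies the rule quoted in the warning might look like the sketch below.  The role names, URLs and the helper ``caption_is_valid`` are examples chosen for illustration and are not taken from this patch.

```python
# Fragment in the style of a Sphinx conf.py; the role names and URLs are examples.
extlinks = {
    # role name: (URL template, caption template)
    "issue": ("https://github.com/searxng/searxng/issues/%s", "issue #%s"),
    "pull": ("https://github.com/searxng/searxng/pull/%s", "PR #%s"),
    # A literal percent sign in a caption must be written as '%%'.
    "coverage": ("https://example.org/coverage/%s", "100%% of %s"),
}


def caption_is_valid(caption):
    """Check that a caption has exactly one '%s' once escaped '%%' are ignored."""
    unescaped = caption.replace("%%", "")
    return unescaped.count("%s") == 1 and unescaped.count("%") == 1
```

A check like ``caption_is_valid`` can be run over all captions before a Sphinx upgrade to find definitions that would trigger the warning.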

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-05 17:10:19 +02:00
Markus Heiser eb5bea16ff [fix] sphinx 5.x: add `nav.contents` everywhere that `div.topic` is used
Previously, docutils produced `div.topic` for the contents directive, the latest
version produces `nav.contents`.  This means that those tables of contents
change appearance when switching to docutils 0.18 [1][2].

[1] https://github.com/sphinx-doc/sphinx/pull/10535/commits/5806f0a
[2] https://github.com/sphinx-doc/sphinx/issues/10534

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-05 17:05:06 +02:00
Markus Heiser d8de994e0f [docs] document preference 'center_alignment' in the 'ui' section.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-03 17:58:40 +02:00
Markus Heiser faf56d4f96 [docs] add documentation about the `general.donation_url:` setting
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-07-02 13:50:06 +02:00
Alexandre Flament 8c7235d5db
Update docs/donate.rst
Co-authored-by: Émilien Devos <contact@emiliendevos.be>
2022-06-29 21:02:46 +02:00
Alexandre FLAMENT 0e503c990a Move donation page to docs.searxng.org and link to it from instances
Close #1378
2022-06-29 17:26:19 +00:00
Markus Heiser b224761a1b [doc] intersphinx: fix https://python-babel.github.io/flask-babel
The URL https://flask-babel.tkte.ch/ is no longer valid [1].

[1] https://github.com/python-babel/flask-babel/commit/0847cc6284

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-22 09:22:21 +02:00