This patch adds the boilerplate code, needed to fetch properties from engines.
In the past we only fetched *languages* but some engines need *regions* to
parameterize the engine request.
To fit into our *fetch language* procedures the boilerplate is implemented in
the `searxng_extra/update/update_languages.py` and the *engine_properties* are
stored along in the `searx/data/engines_languages.json`.
This implementation is downward compatible to the `_fetch_fetch_languages()`
infrastructure we have. If there comes the day we have all
`_fetch_fetch_languages()` implementations moved to `_fetch_engine_properties()`
implementations, we can rename the files and scripts.
The new type `engine_properties` is a dictionary with keys `languages` and
`regions`. The values are dictionaries to map from SearXNG's language & region
to option values the engine does use::
engine_properties = {
'type' : 'engine_properties', # <-- !!!
'regions': {
# 'ca-ES' : <engine's region name>
},
'languages': {
# 'ca' : <engine's language name>
},
}
Similar to the `supported_languages`, in the engine the properties are available
under the name `supported_properties`.
Initial we start with languages & regions, but in a wider sense the type is
named *engine properties*. Engines can store in whatever options they need and
may be in the future there is a need to fetch additional or complete different
properties.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- fix the issue of fetching more the 7000 *languages*
- improve the request function and filter by language & country
- implement time_range_support & safesearch
- add more fields to the response from dailymotion (allow_embed, length)
- better clean up of HTML tags in the 'content' field.
This is more or less a complete rework based on the '/videos' API from [1].
This patch cleans up the language list in SearXNG that has been polluted by the
ISO-639-3 2 and 3 letter codes from dailymotion languages which have never been
used.
[1] https://developers.dailymotion.com/tools/
Closes: https://github.com/searxng/searxng/issues/1065
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
There are several reasons why we should prefer markdown-it-py over mistletoe:
- Get identical rendering results in SearXNG's `/info` pages and the SearXNG's
project documentation which is build by Sphinx-doc.
In the Sphinx-doc we use the MyST parser to render Markdown and the MyST
parser itself is built on top of the markdown-it-py package.
- markdown-it-py has a typographer that supports *replacements*
and *smartquotes* (e.g. em-dash, copyright, ellipsis, ...) [1]
- markdown-it-py is much more flexible compared to mistletoe [2]
- markdown-it-py is the fastest CommonMark compliant parser in python [3]
[1] https://markdown-it-py.readthedocs.io/en/latest/using.html#typographic-components
[2] https://markdown-it-py.readthedocs.io/en/latest/plugins.html
[3] https://markdown-it-py.readthedocs.io/en/latest/other.html#performance
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>