Wikimedia

Wikipedia

This module implements the Wikipedia engine. Some of these implementations are shared by other engines.

The list of supported languages is fetched from the article linked by list_of_wikipedias.

Unlike traditional search engines, Wikipedia does not offer a single site for all languages; there is one Wikipedia for each supported language. Some of these Wikipedias have a LanguageConverter enabled (rest_v1_summary_url).

A LanguageConverter (LC) is a system based on language variants that automatically converts the content of a page into a different variant. A variant is mostly the same language in a different script.

PR-2554:

The Wikipedia link returned by the API is still the same in all cases (https://zh.wikipedia.org/wiki/出租車) but if your browser’s Accept-Language is set to any of zh, zh-CN, zh-TW, zh-HK or .. Wikipedia’s LC automatically returns the desired script in their web-page.

To support Wikipedia’s LanguageConverter, a SearXNG request to Wikipedia uses get_wiki_params and wiki_lc_locale_variants in the fetch_wikimedia_traits function.

To test in SearXNG, query for !wp 出租車 with each of the available Chinese options:

  • !wp 出租車 :zh should show 出租車

  • !wp 出租車 :zh-CN should show 出租车

  • !wp 出租車 :zh-TW should show 計程車

  • !wp 出租車 :zh-HK should show 的士

  • !wp 出租車 :zh-SG should show 德士

searx.engines.wikipedia.fetch_wikimedia_traits(engine_traits: EngineTraits)

Fetch languages from Wikipedia. Not all languages from the list_of_wikipedias are supported by SearXNG locales, only those known from searx.locales.LOCALE_NAMES or those with a minimal editing depth.

The network location (netloc) of each language's Wikipedia is stored in a custom field (wiki_netloc). Here is a reduced example:

traits.custom['wiki_netloc'] = {
    "en": "en.wikipedia.org",
    ..
    "gsw": "als.wikipedia.org",
    ..
    "zh": "zh.wikipedia.org",
    "zh-classical": "zh-classical.wikipedia.org"
}

searx.engines.wikipedia.get_wiki_params(sxng_locale, eng_traits)

Returns the Wikipedia language tag and the netloc that fit the sxng_locale. To support LanguageConverter, this function rates a locale (region) higher than a language (compare wiki_lc_locale_variants).
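
The "region before language" rule can be illustrated with a minimal, self-contained sketch. It is not the SearXNG implementation: it uses plain dictionaries instead of EngineTraits, the helper names build_region_map and pick_wiki_params are hypothetical, and the fallback to "en" is an assumption. The two tables are copied from the reduced wiki_netloc example and from wiki_lc_locale_variants in this document.

```python
# Values copied from this page; the real tables are built at runtime.
WIKI_LC_LOCALE_VARIANTS = {
    "zh": ("zh-CN", "zh-HK", "zh-MO", "zh-MY", "zh-SG", "zh-TW"),
    "zh-classical": ("zh-classical",),
}

WIKI_NETLOC = {
    "en": "en.wikipedia.org",
    "gsw": "als.wikipedia.org",
    "zh": "zh.wikipedia.org",
    "zh-classical": "zh-classical.wikipedia.org",
}


def build_region_map(variants):
    """Invert the variants table: locale (region) -> Wikipedia language tag."""
    return {
        locale: wiki_tag
        for wiki_tag, locales in variants.items()
        for locale in locales
    }


def pick_wiki_params(sxng_locale, region_map, netloc_map, default="en"):
    """Hypothetical helper: return (wiki_tag, netloc) for a locale.

    A full locale match (region) is rated higher than a match on the bare
    language; if neither is known, fall back to ``default`` (an assumption).
    """
    language = sxng_locale.split("-")[0]
    if sxng_locale in region_map:
        wiki_tag = region_map[sxng_locale]
    elif language in netloc_map:
        wiki_tag = language
    else:
        wiki_tag = default
    return wiki_tag, netloc_map[wiki_tag]
```

With these tables, "zh-TW" resolves through the region map to the "zh" Wikipedia, "zh-classical" keeps its own wiki instead of collapsing to "zh", and an unknown locale such as "fr-FR" falls back to the English Wikipedia.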

searx.engines.wikipedia.request(query, params)

Assemble a request (wikipedia rest_v1 summary API).

searx.engines.wikipedia.list_of_wikipedias = 'https://meta.wikimedia.org/wiki/List_of_Wikipedias'

List of all Wikipedias.

searx.engines.wikipedia.rest_v1_summary_url = 'https://{wiki_netloc}/api/rest_v1/page/summary/{title}'

Wikipedia rest_v1 summary API:

The summary response includes an extract of the first paragraph of the page in plain text and HTML as well as the type of page. This is useful for page previews (fka. Hovercards, aka. Popups) on the web and link previews in the apps.

HTTP Accept-Language header (send_accept_language_header):

The desired language variant code for wikis where LanguageConverter is enabled.
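
As a rough illustration of how the URL template and the Accept-Language header fit together, the sketch below fills in rest_v1_summary_url and returns the headers for one request. The function build_summary_request and its parameters are hypothetical and only approximate what request() might do; the sketch sends nothing over the network.

```python
from urllib.parse import quote

# Template copied from rest_v1_summary_url on this page.
REST_V1_SUMMARY_URL = "https://{wiki_netloc}/api/rest_v1/page/summary/{title}"


def build_summary_request(title, wiki_netloc, accept_language=None):
    """Hypothetical helper: return (url, headers) for a summary lookup.

    The title is percent-encoded so that non-ASCII characters and "/" stay
    inside the last path segment. The Accept-Language header is only set
    when a variant is requested (relevant for LanguageConverter wikis).
    """
    url = REST_V1_SUMMARY_URL.format(
        wiki_netloc=wiki_netloc,
        title=quote(title, safe=""),
    )
    headers = {}
    if accept_language:
        headers["Accept-Language"] = accept_language
    return url, headers
```

For example, build_summary_request("出租車", "zh.wikipedia.org", "zh-TW") yields a URL on zh.wikipedia.org with the percent-encoded title and the header Accept-Language: zh-TW, which is what lets the LanguageConverter pick the desired variant.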

searx.engines.wikipedia.send_accept_language_header = True

The HTTP Accept-Language header is needed for wikis where LanguageConverter is enabled.

searx.engines.wikipedia.wiki_lc_locale_variants = {'zh': ('zh-CN', 'zh-HK', 'zh-MO', 'zh-MY', 'zh-SG', 'zh-TW'), 'zh-classical': ('zh-classical',)}

Mapping rule of the LanguageConverter to map a language and its variants to a Locale (used in the HTTP Accept-Language header). For example see LC Chinese.

searx.engines.wikipedia.wikipedia_article_depth = 'https://meta.wikimedia.org/wiki/Wikipedia_article_depth'

The editing depth of Wikipedia is one of several possible rough indicators of the encyclopedia’s collaborative quality, showing how frequently its articles are updated. The measurement of depth was introduced after some limitations of the classic measurement of article count were realized.

Wikidata

This module implements the Wikidata engine. Some implementations are shared with the Wikipedia engine.

searx.engines.wikidata.fetch_traits(engine_traits: EngineTraits)

Uses languages evaluated from wikipedia.fetch_wikimedia_traits and removes

  • traits.custom['wiki_netloc']: wikidata does not have net-locations for the languages and the list of all

  • traits.custom['WIKIPEDIA_LANGUAGES']: not used in the wikipedia engine

searx.engines.wikidata.get_thumbnail(img_src)

Get a thumbnail image from Wikimedia Commons.

Images from commons.wikimedia.org are (HTTP) redirected to upload.wikimedia.org. The redirected URL can be calculated by this function.
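
The calculation can be sketched as follows, assuming the hashed directory layout that upload.wikimedia.org uses for Commons files: the first one and two hex digits of the MD5 of the file name (spaces replaced by underscores) form the directory path, and SVG thumbnails are served as PNG. The function commons_thumbnail_url and its signature are hypothetical. Unlike get_thumbnail, which takes an img_src URL, this sketch takes the file name and the width directly.

```python
from hashlib import md5
from urllib.parse import quote

UPLOAD_THUMB_PREFIX = "https://upload.wikimedia.org/wikipedia/commons/thumb/"


def commons_thumbnail_url(file_name, width):
    """Hypothetical helper: build a thumbnail URL for a Commons file.

    Assumes the layout <h>/<hh>/<name>/<width>px-<name>, where <h> and <hh>
    are the first one and two hex digits of md5(<name>).
    """
    name = file_name.replace(" ", "_")
    digest = md5(name.encode("utf-8")).hexdigest()
    # SVG files are rendered to PNG for thumbnails, so ".png" is appended.
    thumb_name = name + ".png" if name.lower().endswith(".svg") else name
    return (
        f"{UPLOAD_THUMB_PREFIX}{digest[0]}/{digest[:2]}/"
        f"{quote(name)}/{width}px-{quote(thumb_name)}"
    )
```

Computing the URL locally avoids following the HTTP redirect from commons.wikimedia.org for every image.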