[fix] replace language_support by a language/region view

Most engines return the best results when a region is selected.  In most cases
a language is also among the properties of an engine, and sometimes the
language argument is just the language of the UI.  In most cases, choosing a
language has only a minor effect on the result list.

To summarize:

Some engines have language codes (e.g. `ca`) in their properties, some have
region codes (e.g. `ca-ES`), some have both regions and languages in their
properties, and other engines have no language or region support at all.

In the past we generalized *language* over all kinds of engines without taking
into account that most engines give the best results when a region is selected.

  This *language-centric* view in SearXNG is misleading when we need
  region codes to parameterize engine requests!

This patch replaces the *language-centric* view with a "language / region" view.

Conclusions:

With regions we can no longer say that an engine supports *this or that*
language.  For example: when the user selects 'zh' and an engine supports only
region codes like 'zh-TW' or 'zh-CN', we do not know what results the user
expects.  The same applies to 'en' or 'fr' when the engine needs a region tag.
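
The ambiguity can be illustrated with a minimal sketch.  The helper
`candidate_regions` and the tag list are hypothetical and not part of SearXNG;
the sketch only shows that a bare language tag maps to several region tags, so
there is no single obvious choice:

```python
def candidate_regions(selected, engine_regions):
    """Return the engine's region tags that share the selected language.

    Hypothetical helper for illustration only, not SearXNG code.
    """
    lang = selected.split('-')[0].lower()
    return [tag for tag in engine_regions if tag.split('-')[0].lower() == lang]


# Hypothetical engine that accepts region tags only.
engine_regions = ['zh-TW', 'zh-CN', 'en-US', 'en-GB', 'fr-FR', 'fr-CA']

# More than one candidate per language: the selection alone does not tell us
# which region the user expects.
print(candidate_regions('zh', engine_regions))
print(candidate_regions('fr', engine_regions))
```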

- Since it is unclear what the user expects from their language selection, we
  can't assert a property that says: "supports_selected_language"

  The feature is replaced in the UI by the wider sense of "language_support",
  which stands for:

    The engine has some kind of language support, either
    by a region tag or by a language tag.

- A list of "supported_languages" does not make sense when regions determine
  the result of an engine.

  The "supported_languages" field has been removed from the /config URL.

- The `has_language` test in `searx/search/checker/impl.py` has been removed
  since it does not cover engines with region support.

  If there is a need for such a test, we can implement new tests after all
  engines with language (region) support have been moved to the *supported
  properties* scheme (see searxng_extra/update/update_languages.py).
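
The wider sense of "language_support" described above could be sketched
roughly as follows.  `EngineProperties`, `has_language_support` and the example
codes are assumptions made for illustration; they are not the names or the
data layout used by SearXNG:

```python
from dataclasses import dataclass, field


@dataclass
class EngineProperties:
    """Hypothetical stand-in for an engine's supported properties."""

    languages: dict = field(default_factory=dict)  # e.g. {'ca': 'ca'}
    regions: dict = field(default_factory=dict)  # e.g. {'ca-ES': 'ca_ES'}


def has_language_support(props: EngineProperties) -> bool:
    """True if the engine knows at least one language tag or one region tag."""
    return bool(props.languages or props.regions)


by_language = EngineProperties(languages={'ca': 'ca'})
by_region = EngineProperties(regions={'ca-ES': 'ca_ES'})
neither = EngineProperties()

for props in (by_language, by_region, neither):
    print(has_language_support(props))
```

The flag deliberately says nothing about whether the language currently
selected by the user is covered; it only states that the engine can be
parameterized by a language or a region at all.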

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Markus Heiser 2022-04-19 10:49:49 +02:00
parent 5fca26f0b3
commit 4a28a5e6f6
5 changed files with 4 additions and 44 deletions

@@ -293,13 +293,6 @@ class ResultContainerTests:
         if len(self.result_container.answers) == 0:
             self._record_error('No answer')

-    def has_language(self, lang):
-        """Check at least one title or content of the results is written in the `lang`.
-
-        Detected using pycld3, may be not accurate"""
-        if lang not in self.languages:
-            self._record_error(lang + ' not found')
-
     def not_empty(self):
         """Check the ResultContainer has at least one answer or infobox or result"""
         result_types = set()

@@ -196,16 +196,6 @@ class OnlineProcessor(EngineProcessor):
                 'test': ['unique_results'],
             }

-        if getattr(self.engine, 'supported_languages', []):
-            tests['lang_fr'] = {
-                'matrix': {'query': 'paris', 'lang': 'fr'},
-                'result_container': ['not_empty', ('has_language', 'fr')],
-            }
-            tests['lang_en'] = {
-                'matrix': {'query': 'paris', 'lang': 'en'},
-                'result_container': ['not_empty', ('has_language', 'en')],
-            }
-
         if getattr(self.engine, 'safesearch', False):
             tests['safesearch'] = {'matrix': {'query': 'porn', 'safesearch': (0, 2)}, 'test': ['unique_results']}

@@ -344,7 +344,6 @@
 <th scope="col">{{ _("Allow") }}</th>
 <th scope="col">{{ _("Engine name") }}</th>
 <th scope="col">{{ _("Shortcut") }}</th>
-<th scope="col" class="col-stat">{{ _("Selected language") }}</th>
 <th scope="col" class="col-stat">{{ _("SafeSearch") }}</th>
 <th scope="col" class="col-stat">{{ _("Time range") }}</th>
 <th scope="col">{{ _("Response time") }}</th>
@@ -356,7 +355,6 @@
 <th scope="col" class="text-right">{{ _("Response time") }}</th>
 <th scope="col" class="text-right">{{ _("Time range") }}</th>
 <th scope="col" class="text-right">{{ _("SafeSearch") }}</th>
-<th scope="col" class="text-right">{{ _("Selected language") }}</th>
 <th scope="col" class="text-right">{{ _("Shortcut") }}</th>
 <th scope="col" class="text-right">{{ _("Engine name") }}</th>
 <th scope="col" class="text-right">{{ _("Allow") }}</th>
@@ -383,7 +381,6 @@
 {{- engine_about(search_engine, 'tooltip_' + categ + '_' + search_engine.name) -}}
 </th>
 <td class="name">{{ shortcuts[search_engine.name] }}</td>
-<td>{{ support_toggle(supports[search_engine.name]['supports_selected_language']) }}</td>
 <td>{{ support_toggle(supports[search_engine.name]['safesearch']) }}</td>
 <td>{{ support_toggle(supports[search_engine.name]['time_range_support']) }}</td>
 {{ engine_time(search_engine.name, 'text-right') }}
@@ -395,7 +392,6 @@
 {{ engine_time(search_engine.name, 'text-left') }}
 <td>{{ support_toggle(supports[search_engine.name]['time_range_support']) }}</td>
 <td>{{ support_toggle(supports[search_engine.name]['safesearch']) }}</td>
-<td>{{ support_toggle(supports[search_engine.name]['supports_selected_language']) }}</td>
 <td>{{ shortcuts[search_engine.name] }}</td>
 <th scope="row" data-engine-name="{{ search_engine.name }}"><span>{% if search_engine.enable_http %}{{ icon('exclamation-sign', 'No HTTPS') }}{% endif %}{{ search_engine.name }}</span>{{ engine_about(search_engine) }}</th>
 <td class="onoff-checkbox">

@@ -297,7 +297,7 @@
 <th class="engine_checkbox">{{ _("Allow") }}</th>{{- "" -}}
 <th class="name">{{ _("Engine name") }}</th>{{- "" -}}
 <th class="shortcut">{{ _("Shortcut") }}</th>{{- "" -}}
-<th>{{ _("Supports selected language") }}</th>{{- "" -}}
+<th>{{ _("Language / Region") }}</th>{{- "" -}}
 <th>{{ _("SafeSearch") }}</th>{{- "" -}}
 <th>{{ _("Time range") }}</th>{{- "" -}}
 <th>{{ _("Response time") }}</th>{{- "" -}}
@@ -323,7 +323,7 @@
 {{- engine_about(search_engine) -}}
 </th>{{- "" -}}
 <td class="shortcut">{{ shortcuts[search_engine.name] }}</td>{{- "" -}}
-<td>{{ checkbox(None, supports[search_engine.name]['supports_selected_language'], true) }}</td>{{- "" -}}
+<td>{{ checkbox(None, supports[search_engine.name]['language_support'], true) }}</td>{{- "" -}}
 <td>{{ checkbox(None, supports[search_engine.name]['safesearch'], true) }}</td>{{- "" -}}
 <td>{{ checkbox(None, supports[search_engine.name]['time_range_support'], true) }}</td>{{- "" -}}
 {{- engine_time(search_engine.name) -}}

@@ -1008,7 +1008,6 @@ def preferences():
             'rate80': rate80,
             'rate95': rate95,
             'warn_timeout': e.timeout > settings['outgoing']['request_timeout'],
-            'supports_selected_language': _is_selected_language_supported(e, request.preferences),
             'result_count': result_count,
         }
     # end of stats
@@ -1058,20 +1057,17 @@ def preferences():
     # supports
     supports = {}
     for _, e in filtered_engines.items():
-        supports_selected_language = _is_selected_language_supported(e, request.preferences)
         safesearch = e.safesearch
         time_range_support = e.time_range_support
         for checker_test_name in checker_results.get(e.name, {}).get('errors', {}):
-            if supports_selected_language and checker_test_name.startswith('lang_'):
-                supports_selected_language = '?'
-            elif safesearch and checker_test_name == 'safesearch':
+            if safesearch and checker_test_name == 'safesearch':
                 safesearch = '?'
             elif time_range_support and checker_test_name == 'time_range':
                 time_range_support = '?'
         supports[e.name] = {
-            'supports_selected_language': supports_selected_language,
             'safesearch': safesearch,
             'time_range_support': time_range_support,
+            'language_support': e.language_support
         }

     return render(
@@ -1106,16 +1102,6 @@ def preferences():
     )


-def _is_selected_language_supported(engine, preferences: Preferences):  # pylint: disable=redefined-outer-name
-    language = preferences.get_value('language')
-    if language == 'all':
-        return True
-    x = match_language(
-        language, getattr(engine, 'supported_languages', []), getattr(engine, 'language_aliases', {}), None
-    )
-    return bool(x)
-
-
 @app.route('/image_proxy', methods=['GET'])
 def image_proxy():
     # pylint: disable=too-many-return-statements, too-many-branches
@@ -1331,10 +1317,6 @@ def config():
         if not request.preferences.validate_token(engine):
             continue

-        supported_languages = engine.supported_languages
-        if isinstance(engine.supported_languages, dict):
-            supported_languages = list(engine.supported_languages.keys())
-
         _engines.append(
             {
                 'name': name,
@@ -1343,7 +1325,6 @@ def config():
                 'enabled': not engine.disabled,
                 'paging': engine.paging,
                 'language_support': engine.language_support,
-                'supported_languages': supported_languages,
                 'safesearch': engine.safesearch,
                 'time_range_support': engine.time_range_support,
                 'timeout': engine.timeout,