Commit Graph

4789 Commits

Author SHA1 Message Date
Markus Heiser
d5ceb4c91b [mod] document server:public_instance & remove it from botdetection
- the option server:public_instance lacks some documentation
- the processing of this option belongs in the limiter and not
  in the botdetection module

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-02 10:08:01 +01:00
Markus Heiser
bfabb1a67a [mod] isolation of botdetection from the limiter
This patch was inspired by the discussion around PR-2882 [2].  The goals of this
patch are:

1. Convert plugin searx.plugin.limiter to normal code [1]
2. Isolate botdetection from the limiter [2]
3. Move searx/{tools => botdetection}/config.py and drop searx.tools
4. In URL /config, 'limiter.enabled' is true only if the limiter is really
   enabled (Redis is available).

This patch moves all the code that belongs to botdetection into namespace
searx.botdetection and code that belongs to limiter is placed in namespace
searx.limiter.

The limiter used to be a plugin; when botdetection was added at some point, it
was not a plugin.  The modularization of these two components was long overdue.
With the clear modularization, the documentation could then also be organized
according to the architecture.

[1] https://github.com/searxng/searxng/pull/2882
[2] https://github.com/searxng/searxng/pull/2882#issuecomment-1741716891

To test:

- check the app works without the limiter, check `/config`
- check the app works with the limiter and with the token, check `/config`
- run `make docs.live` and read
  - http://0.0.0.0:8000/admin/searx.limiter.html
  - http://0.0.0.0:8000/src/searx.botdetection.html#botdetection

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-02 10:08:01 +01:00
sev
d9585cebdd Check public_instance in simple theme
Fix #2975
2023-11-02 10:08:01 +01:00
Markus Heiser
47d5b2f332 [fix] limit maximum page number of a search query to page 50.
To test this PR run a local instance and try to query page 51:

    http://127.0.0.1:8888/search?q=foo&pageno=51

A parameter exception will be raised:

    searx.exceptions.SearxParameterException: Invalid value "51" for parameter pageno

And the client will receive an HTTP 400 (Bad Request).
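
The check described above can be sketched roughly as follows.  This is a minimal
illustration under stated assumptions, not the actual SearXNG code: the names
`MAX_PAGENO`, `ParameterError` and `parse_pageno` are hypothetical and were
chosen only for this example.

```python
MAX_PAGENO = 50  # assumed constant: the commit limits queries to page 50


class ParameterError(ValueError):
    """Hypothetical stand-in for searx.exceptions.SearxParameterException."""

    def __init__(self, name, value):
        super().__init__(f'Invalid value "{value}" for parameter {name}')
        self.name = name
        self.value = value


def parse_pageno(raw):
    """Return the page number as an int, or raise ParameterError."""
    try:
        pageno = int(raw)
    except ValueError:
        # Not a number at all
        raise ParameterError("pageno", raw) from None
    if not 1 <= pageno <= MAX_PAGENO:
        # Out of the allowed range 1..MAX_PAGENO
        raise ParameterError("pageno", raw)
    return pageno
```

A request handler would typically catch such an exception and answer with
HTTP 400, which is the behaviour the test instructions above describe.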

Closes https://github.com/searxng/searxng/issues/2972

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-02 10:08:01 +01:00
dalf
07d112bc7e Update searx.data - update_engine_traits.py 2023-11-02 10:08:01 +01:00
dalf
bb88f74cbe Update searx.data - update_wikidata_units.py 2023-11-02 10:08:01 +01:00
dalf
7c5875465b Update searx.data - update_firefox_version.py 2023-11-02 10:08:01 +01:00
dalf
1ab543e3ae Update searx.data - update_currencies.py 2023-11-02 10:08:01 +01:00
dalf
17ac055c6a Update searx.data - update_engine_descriptions.py 2023-11-02 10:08:01 +01:00
dalf
84b37a9362 Update searx.data - update_ahmia_blacklist.py 2023-11-02 10:08:01 +01:00
searxng-bot
35ec3b5927 [translations] update from Weblate
4e5e5db44 - 2023-10-26 - quenty_occitania <quentinantonin@free.fr>
e1a8d3508 - 2023-10-26 - quenty_occitania <quentinantonin@free.fr>
84bddfb89 - 2023-10-26 - return42 <markus.heiser@darmarit.de>
d67a4114d - 2023-10-26 - quenty_occitania <quentinantonin@free.fr>
62fe8e328 - 2023-10-26 - return42 <markus.heiser@darmarit.de>
6e37ab975 - 2023-10-26 - quenty_occitania <quentinantonin@free.fr>
2cdab3247 - 2023-10-25 - SomeTr <SomeTr@users.noreply.translate.codeberg.org>
cf7ea7234 - 2023-10-23 - clsty <celestial.y@outlook.com>
0ea313893 - 2023-10-23 - return42 <markus.heiser@darmarit.de>
22151e440 - 2023-10-23 - return42 <markus.heiser@darmarit.de>
e4eaf42b6 - 2023-10-22 - clsty <celestial.y@outlook.com>
2023-11-02 10:08:01 +01:00
Markus Heiser
eea673831b [fix] HTMLParser: undocumented not implemented method
In Python versions < 3.10 there is an issue with an undocumented method
HTMLParser.error() [1][2] that was deprecated in Python 3.4 and removed
in Python 3.5.

To be compatible with higher versions (>= 3.10), an error method is implemented
which raises an AssertionError exception like the higher Python versions do [3].
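
A minimal sketch of this workaround, assuming a simple text-collecting parser:
the class name `TextExtractor` and its `chunks` attribute are hypothetical, only
`html.parser.HTMLParser` and the `error()` hook come from the description above.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical HTMLParser subclass that collects text nodes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by HTMLParser for every text node
        self.chunks.append(data)

    def error(self, message):
        # On Python < 3.10 the base class may still call self.error().
        # Raising AssertionError mirrors what newer Python versions do.
        raise AssertionError(message)
```

On Python >= 3.10 the extra method is simply never called by the base class, so
defining it is harmless there.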

[1] https://github.com/python/cpython/issues/76025
[2] https://bugs.python.org/issue31844
[3] https://github.com/python/cpython/pull/8562

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-02 10:08:01 +01:00
searxng-bot
98cd1028b6 [translations] update from Weblate
2325f1583 - 2023-10-20 - return42 <markus.heiser@darmarit.de>
5090c6a8e - 2023-10-18 - return42 <markus.heiser@darmarit.de>
3a38219d8 - 2023-10-18 - return42 <markus.heiser@darmarit.de>
94a9f4164 - 2023-10-16 - return42 <markus.heiser@darmarit.de>
bdbeb4b30 - 2023-10-16 - return42 <markus.heiser@darmarit.de>
f9b483f48 - 2023-10-16 - return42 <markus.heiser@darmarit.de>
7f1ca1997 - 2023-10-16 - return42 <markus.heiser@darmarit.de>
c5a701dc4 - 2023-10-14 - alexgabi <alexgabi@disroot.org>
2023-11-02 10:08:01 +01:00
Markus Heiser
a8d42bcad3 [build] /static 2023-11-02 10:08:01 +01:00
Markus Heiser
e09663fe81 [build] /static 2023-11-02 10:02:27 +01:00
9a8b853e03 fix logo size for ultra small screen 2023-11-02 09:54:48 +01:00
b87ee6be8a new version + new build 2023-10-19 18:14:37 +02:00
Markus Heiser
08551001d4 [build] /static 2023-10-19 18:05:38 +02:00
Bnyro
0971fb0aee [fix] search.js: crash on homepage when setting form listeners 2023-10-19 18:05:12 +02:00
Émilien (perso)
85e6b680ff fixing results parsing brave 2023-10-19 18:05:12 +02:00
searxng-bot
1b9eea9c2e [translations] update from Weblate
74e401e68 - 2023-10-09 - return42 <markus.heiser@darmarit.de>
897dd8db1 - 2023-10-09 - return42 <markus.heiser@darmarit.de>
6ed046a90 - 2023-10-09 - tentsbet <remendne@pentrens.jp>
815ecb336 - 2023-10-09 - return42 <markus.heiser@darmarit.de>
65d9a0c2f - 2023-10-07 - return42 <markus.heiser@darmarit.de>
3ec249ef9 - 2023-10-07 - return42 <markus.heiser@darmarit.de>
53dc6c108 - 2023-10-07 - return42 <markus.heiser@darmarit.de>
2023-10-19 18:05:12 +02:00
Hackurei
e1efe3af93 [fix] hackernews keyerror problem 2023-10-19 18:05:12 +02:00
Hackurei
88003588ad [fix] imgur - incorrect wikidata id 2023-10-19 18:05:12 +02:00
Markus Heiser
d65753428b [fix] ddg-lite & ddg-extra: don't send empty vqd value
DDG's bot detection is sensitive to the vqd value.  For some search terms (such
as extremely long search terms that are often sent by bots), no vqd value can be
determined.

If SearXNG cannot determine a vqd value, then no request should go out to
DDG (WEB): a request with a wrong vqd value leads to DDG temporarily putting
SearXNG's IP on a block list.
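
The rule above could be sketched like this.  It is an illustration only: the
function `build_ddg_params` and the shape of the returned dict are assumptions,
not the engine's real interface.

```python
def build_ddg_params(query, vqd):
    """Return request parameters, or None if no request should be sent.

    `vqd` is the value determined for `query`; it may be empty or None
    when no value could be determined.
    """
    if not vqd:
        # No vqd value: do not send anything to DDG, since a request with
        # a wrong vqd value risks getting the IP on the block list.
        return None
    return {"q": query, "vqd": vqd}
```

The caller is expected to skip the outgoing request entirely when `None` is
returned, rather than sending the query with an empty vqd value.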

Requests from IPs in this block list run into timeouts.

Not sure, but it seems the block list is a sliding window: to get my IP off
the block list I had to let it cool down for 1h (send no requests from that IP
to DDG).

Since such issues can't be reproduced in a local instance, I tested this patch
for 24h on my public SearXNG instance: there are still errors (rare), but the
reliability is still 100%.

Related:

- https://github.com/searxng/searxng/pull/2922
- https://github.com/searxng/searxng/pull/2923

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:05:12 +02:00
Markus Heiser
dac63f7764 [fix] ddg-lite vqd value: some search terms do not have a vqd value
Some search terms do not have results and therefore no vqd value.

BTW: remove a leftover from 9197efa

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:53 +02:00
Markus Heiser
4cb8cbc6c3 [fix] duckduckgo lite engine: set HTTP header 'Referer'
We have had problems with this before: the bot protection from ddg-lite seems to
have included this referer in the rating [1][2].

From reverse engineering:

- The Referer ``https://google.com/`` was set in commit 257dc7d6c4 --> DDG lite
  does not like this referer anymore!

- The 'Referer' header is only set on the second and follow-up pages but not on
  the first page

- The vqd value is not needed on the first page, the ddg-lite client sets this
  value only on follow-up pages / this can help to reduce the vqd requests from
  SearXNG.
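
The findings above can be sketched as follows.  This is a hedged illustration,
not the engine code: `ASSUMED_REFERER`, `build_request` and the `get_vqd`
callable are hypothetical, and the Referer value is only a placeholder for
whatever value the ddg-lite client really sends.

```python
# Placeholder value, assumed for illustration only
ASSUMED_REFERER = "https://lite.duckduckgo.com/"


def build_request(query, pageno, get_vqd):
    """Return (headers, form) for a ddg-lite style request.

    `get_vqd` is a callable that looks up the vqd value for a query; it
    is only invoked for follow-up pages, which saves one lookup on the
    first page.
    """
    headers = {}
    form = {"q": query}
    if pageno > 1:
        # Referer and vqd are only sent on the second and follow-up pages
        headers["Referer"] = ASSUMED_REFERER
        form["vqd"] = get_vqd(query)
    return headers, form
```

Passing the lookup as a callable keeps the first-page path free of any vqd
request, which is the saving mentioned in the last bullet.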

Related to 'Referer' header & ddg requests:

[1] https://github.com/searxng/searxng/pull/2161
[2] https://github.com/searxng/searxng/pull/2081

Closes: https://github.com/searxng/searxng/issues/2796
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:53 +02:00
Bnyro
16ce63b8a2 [mod] yacy: use official instance by default and fix crashes 2023-10-19 18:04:53 +02:00
Alex Balgavy
c2f26a560a [mod] add hotkeys option to settings.yml
The change in the hotkey mechanism introduced in 317db5b04 does not allow
configuration via `settings.yml`.  This commit adds that functionality.

Closes: #2898
2023-10-19 18:04:53 +02:00
Hackurei
ce6b7c9f75 [feat] implement hackernews engine - news.ycombinator.com 2023-10-19 18:04:53 +02:00
Aine
fa830b375b [fix] matrixrooms add proper MRS integration
Related:

- https://github.com/searxng/searxng/issues/2918
2023-10-19 18:04:53 +02:00
Bnyro
34d0d2c9ab [feat] duckduckgo: support for videos and news 2023-10-19 18:04:53 +02:00
Bnyro
84b4932e21 [fix] kickass: crash when no results 2023-10-19 18:04:53 +02:00
Bnyro
057ae1767b [mod] piped: always show video length if available 2023-10-19 18:04:53 +02:00
Bnyro
a742cfe2e7 [feat] engine: implementation of mastodon 2023-10-19 18:04:53 +02:00
searxng-bot
b6cae7d53f [translations] update from Weblate
68d743281 - 2023-10-05 - return42 <markus.heiser@darmarit.de>
42f091b7f - 2023-10-05 - return42 <markus.heiser@darmarit.de>
2479c0d7b - 2023-10-05 - ghose <correo@xmgz.eu>
a4e6cd592 - 2023-10-05 - return42 <markus.heiser@darmarit.de>
9d4e5f5c3 - 2023-10-05 - return42 <markus.heiser@darmarit.de>
b79d44775 - 2023-10-05 - gallegonovato <fran-carro@hotmail.es>
746291184 - 2023-10-06 - return42 <markus.heiser@darmarit.de>
f24d7e8b1 - 2023-10-05 - return42 <markus.heiser@darmarit.de>
6140911f9 - 2023-10-05 - Fjuro <ifjuro@proton.me>
2023-10-19 18:04:53 +02:00
Markus Heiser
ba5f9bd4ac [mod] engine - simplify region & lang handling, make filters configurable
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:53 +02:00
Bnyro
70f6a84aff [feat] engine: implementation of radio-browser.info 2023-10-19 18:04:53 +02:00
Markus Heiser
4de856bf19 [fix] limiter / botdetection: remove http_connection method
Related:

- https://github.com/searxng/searxng/issues/2892#issuecomment-1742153932

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:53 +02:00
Jinyuan Huang
993ea9c1ec [typo] solved a typo in yahoo error message. 2023-10-19 18:04:53 +02:00
Jinyuan Huang
7de4e3b48f [fix] Bug: Yahoo results for simplified Chinese search sometimes have the first character cut off #2866
Co-authored-by: Blair Noctis <n@sail.ng>
2023-10-19 18:04:53 +02:00
Bnyro
f6ee574508 [fix] emojipedia: fix engine 2023-10-19 18:04:52 +02:00
Markus Heiser
f4f2af7db6 [fix] Revision of the Bing engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:52 +02:00
jazzzooo
ae91b993b1 [fix] engine - bing fix search, pagination, remove safesearch 2023-10-19 18:04:52 +02:00
Bnyro
b732fbfe5b [feat] engine: implementation of pinterest 2023-10-19 18:04:52 +02:00
Bnyro
32e04f05bf [fix] matrixrooms.info: pagination not working properly 2023-10-19 18:04:52 +02:00
Markus Heiser
cef109e0e0 [fix] engine - moviepilot instead of thumbnail use img_src
Instead of thumbnail use img_src in the result item, otherwise the "movies"
category looks clunky.

Related:

- b4e0d2eedc (r128785388)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-19 18:04:52 +02:00
Bnyro
3b51a0bf0f [mod] tagesschau: add option to only use tagesschau urls 2023-10-19 18:04:52 +02:00
Bnyro
07d932fba0 [feat] engine: implementation of matrixrooms.info 2023-10-19 18:04:52 +02:00
Bnyro
3974a88dae [feat] engine: implementation of tootfinder 2023-10-19 18:04:52 +02:00
Bnyro
aad75ae867 [mod] add movies category for tmdb, imdb and moviepilot 2023-10-19 18:04:52 +02:00