Commit graph

5299 commits

Author SHA1 Message Date
dalf
cfa8169509 Update searx.data - update_ahmia_blacklist.py 2023-07-29 09:25:24 +02:00
searxng-bot
f45d1920d9 [translations] update from Weblate
ba4888c96 - 2023-07-26 - return42 <markus.heiser@darmarit.de>
6ec8a8a28 - 2023-07-22 - return42 <markus.heiser@darmarit.de>
0a7b701b3 - 2023-07-24 - artnay <jiri.gronroos@iki.fi>
c0b34cbdb - 2023-07-23 - MonsoonFire <re1qnb5mq@mozmail.com>
37cbd41c2 - 2023-07-22 - return42 <markus.heiser@darmarit.de>
2023-07-29 08:15:21 +02:00
mrpaulblack
65d8b1a310 [fix] remove disabled: false from engine definitions in settings.yml
* setting disabled: false is not needed, since it is by default enabled
2023-07-22 18:19:01 +02:00
searxng-bot
51c531d450 [translations] update from Weblate
b7f1e9ae - 2023-07-17 - Hudobni Volk <hudobni.volk@tuta.io>
3c7c821e - 2023-07-16 - alextecplayz <alextec70@outlook.com>
0e305f84 - 2023-07-17 - return42 <markus.heiser@darmarit.de>
80745a22 - 2023-07-15 - tentsbet <remendne@pentrens.jp>
afef0e2e - 2023-07-16 - Salif Mehmed <mail@salif.eu>
4a7687ac - 2023-07-14 - Ivan Gabaldon <admin@inetol.net>
2023-07-22 17:47:05 +02:00
mrpaulblack
b477349824 [build] /static 2023-07-19 15:07:45 +02:00
Kiru
de5c1cedca fix "#backToTop" button always being clickable
`pointer-events` never gets set to "none" when the button is hidden,
allowing you to click the button. And your mouse further changes it's
cursor to the pointer style.
2023-07-19 15:06:09 +02:00
searxng-bot
b7b184244d [translations] update from Weblate
01350cf1 - 2023-07-13 - return42 <markus.heiser@darmarit.de>
5f037a4d - 2023-07-12 - return42 <markus.heiser@darmarit.de>
820a78ad - 2023-07-12 - return42 <markus.heiser@darmarit.de>
73037743 - 2023-07-12 - return42 <markus.heiser@darmarit.de>
e656795c - 2023-07-09 - Linerly <linerly@protonmail.com>
0ee18285 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
c087c7fb - 2023-07-08 - return42 <markus.heiser@darmarit.de>
6eb318c5 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
3b4a3d1f - 2023-07-08 - return42 <markus.heiser@darmarit.de>
b3187499 - 2023-07-09 - return42 <markus.heiser@darmarit.de>
c1226646 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
2356a402 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
c9a74b52 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
3d9f2938 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
68af8585 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
261a2a72 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
fcea15cf - 2023-07-08 - return42 <markus.heiser@darmarit.de>
7685385e - 2023-07-08 - return42 <markus.heiser@darmarit.de>
ec0a3727 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
0130ddf7 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
b93f9609 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
4a5cdcb3 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
9cba3939 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
d973d937 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
ce076245 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
5c36ccab - 2023-07-08 - return42 <markus.heiser@darmarit.de>
226ff7d4 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
8148a9ed - 2023-07-08 - return42 <markus.heiser@darmarit.de>
840bc189 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
51ffc22e - 2023-07-08 - return42 <markus.heiser@darmarit.de>
394ec63e - 2023-07-08 - return42 <markus.heiser@darmarit.de>
428c16a8 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
218cf51e - 2023-07-08 - return42 <markus.heiser@darmarit.de>
70260934 - 2023-07-09 - ghose <correo@xmgz.eu>
c6244c2b - 2023-07-08 - return42 <markus.heiser@darmarit.de>
b92dc5c1 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
43917957 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
df1bf630 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
d1c00dff - 2023-07-08 - return42 <markus.heiser@darmarit.de>
0a6da54f - 2023-07-08 - return42 <markus.heiser@darmarit.de>
12377e28 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
b5b8ea78 - 2023-07-07 - gallegonovato <fran-carro@hotmail.es>
ec31e65f - 2023-07-08 - return42 <markus.heiser@darmarit.de>
6c33b1fe - 2023-07-08 - return42 <markus.heiser@darmarit.de>
393d390c - 2023-07-08 - return42 <markus.heiser@darmarit.de>
a4f6b353 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
0f8d6b6b - 2023-07-08 - return42 <markus.heiser@darmarit.de>
67f2fc96 - 2023-07-08 - Fjuro <ifjuro@proton.me>
5f2d3f02 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
5ae2b8dc - 2023-07-08 - return42 <markus.heiser@darmarit.de>
0bd4fb1e - 2023-07-08 - return42 <markus.heiser@darmarit.de>
ce768726 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
a22ae2f2 - 2023-07-08 - return42 <markus.heiser@darmarit.de>
b5b8774f - 2023-07-08 - return42 <markus.heiser@darmarit.de>
2023-07-14 10:21:27 +02:00
Paolo Basso
cada89ee36 [feat] engine: re-enables z-library (zlibrary-global.se)
- re-enables z-library as the new domain zlibrary-global.se is now available
  from the open web.   The announcement of the domain:

    https://www.reddit.com/r/zlibrary/comments/13whe08/mod_note_zlibraryglobalse_domain_is_officially/

  It is an official domain, it requires to log in to the "personal" subdomain
  only to download files, but the search works.

- changes the result template of zlibrary to paper.html, filling the appropriate fields
- implements language filtering for zlibrary
- implement zlibrary custom filters (engine traits)
- refactor and document the zlibrary engine
2023-07-07 21:36:51 +02:00
Hackurei
cb92767f19 [feat] enigine: add CrowdView forum search engine 2023-07-07 21:36:11 +02:00
searxng-bot
4a2f310da3 [translations] update from Weblate
152f2008 - 2023-07-05 - return42 <markus.heiser@darmarit.de>
9dbf6b22 - 2023-07-01 - return42 <markus.heiser@darmarit.de>
4ad4c00f - 2023-07-01 - Bananhylsa <thayer@hjemmeserver.net>
2023-07-07 21:13:47 +02:00
Markus Heiser
5720844fcd [doc] rearranges Settings & Engines docs for better readability
We have built up detailed documentation of the *settings* and the *engines* over
the past few years.  However, this documentation was still spread over various
chapters and was difficult to navigate in its entirety.

This patch rearranges the Settings & Engines documentation for better
readability.

To review new ordered docs::

   make docs.clean docs.live

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-07-01 22:45:19 +02:00
searxng-bot
81c9a18456 [translations] update from Weblate
2238e87b - 2023-06-28 - jenishngl <jenishngl+codeberg@gmail.com>
c70d228a - 2023-06-24 - nogb <u8cn71wq@yogibo.anonaddy.me>
389c0c62 - 2023-06-24 - return42 <markus.heiser@darmarit.de>
656d9fcb - 2023-06-23 - return42 <markus.heiser@darmarit.de>
a9c9b116 - 2023-06-25 - alma <alma@users.noreply.translate.codeberg.org>
528b845f - 2023-06-24 - nogb <u8cn71wq@yogibo.anonaddy.me>
b8c50f23 - 2023-06-23 - return42 <markus.heiser@darmarit.de>
39f47c0f - 2023-06-23 - return42 <markus.heiser@darmarit.de>
ae0aa811 - 2023-06-24 - Fjuro <ifjuro@proton.me>
c8216259 - 2023-06-26 - lemonadeforlife <nahianlabiblimon44@gmail.com>
2023-06-30 11:49:07 +02:00
dalf
fbb72fc1f4 Update searx.data - update_engine_descriptions.py 2023-06-29 13:59:25 +02:00
Markus Heiser
87e7926ae9 [fix] engine: Anna's Archive - grep results from '.js-scroll-hidden' elements
The renderuing of the WEB page is very strange; except the firts position all
other positions of Anna's result page are enclosed in SGML comments.  These
cooments are *uncommented* by some JS code, see query of the class
'.js-scroll-hidden' in Anna's HTML template [1].

[1] https://annas-software.org/AnnaArchivist/annas-archive/-/blob/main/allthethings/templates/macros/md5_list.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Markus Heiser
e2df6b77a3 [mod] engine: Anna's Archive - additionl settings (content, sort, ext)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Markus Heiser
eafc2906f1 [mod] engine: Anna's Archive - fetch search arguments from search form
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Paolo Basso
7adb9090e5 [mod] engine: Anna's Archive - add language support 2023-06-29 09:32:57 +02:00
Paolo Basso
e5637fe7b9 [feat] engine: implementation of Anna's Archive
Anna's Archive [1] is a free non-profit online shadow library metasearch engine
providing access to a variety of book resources (also via IPFS), created by a
team of anonymous archivists [2].

[1] https://annas-archive.org/
[2] https://annas-software.org/AnnaArchivist/annas-archive
2023-06-29 09:32:57 +02:00
Markus Heiser
fd26f37073 [upd] make data.all
- ahmia_blacklist.txt
- currencies.json
- engine_descriptions.json
- engine_traits.json
- osm_keys_tags.json
- useragents.json

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 21:21:53 +02:00
Markus Heiser
efea962504 [fix] simple template: preferences - add missing icon_smal import
Related: https://github.com/searxng/searxng/commit/2149e88bdd64#r119535272
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 18:36:52 +02:00
Paolo Basso
401561cb58 [mod] engine torznab - refactor & option to hide links
- torznab engine using types and clearer code
- torznab option to hide torrent and magnet links.
- document the torznab engine
- add myself to authors

Closes: https://github.com/searxng/searxng/issues/1124
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 10:03:44 +02:00
Markus Heiser
da7c30291d [fix] Google API changed
It seems that Google is rolling out a modified WEB API [1][2].

In the past there was only the UI language in the `hl` argument but nowadays it
seems a combination of the UI language and the "search region" is mixed in this
argument and the `gl` argument has been removed.  I'm very surprised that google
is starting to mix the parameters of the UI with the parameters of the search
index.

This patch modifies the get_google_info(..) function.  Beside Google-WEB this
function is also used by other Google services, here are some examples to test
region & language of ..

- Google-WEB:    `!go dragon boat :en-CA`
- Google-News:   `!gon dragon boat :en-CA`
- Google-Videos: `!gov bmw :en-CA`
- Goolge-Images  `!goi bmw :en-CA`

- [1] https://github.com/searxng/searxng/issues/2515#issuecomment-1606294635
- [2] https://github.com/searxng/searxng/issues/2515#issuecomment-1607150817

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-26 18:28:09 +02:00
Markus Heiser
e8706fb738 [fix] engine & network issues / documentation and type annotations
This patch fixes some quirks and issues related to the engines and the network.
Each engine has its own network and this network was broken for the following
engines[1]:

- archlinux
- bing
- dailymotion
- duckduckgo
- google
- peertube
- startpage
- wikipedia

Since the files have been touched anyway, the type annotaions of the engine
modules has also been completed so that error messages from the type checker are
no longer reported.

Related and (partial) fixed issue:

- [1] https://github.com/searxng/searxng/issues/762#issuecomment-1605323861
- [2] https://github.com/searxng/searxng/issues/2513
- [3] https://github.com/searxng/searxng/issues/2515

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-25 13:58:26 +02:00
searxng-bot
2e4a435134 [translations] update from Weblate
9512b92a - 2023-06-23 - Coccocoas_Helper <coccocoahelper@gmail.com>
ca08c51e - 2023-06-23 - Coccocoas_Helper <coccocoahelper@gmail.com>
56ad4f21 - 2023-06-21 - return42 <markus.heiser@darmarit.de>
3ee419d6 - 2023-06-21 - return42 <markus.heiser@darmarit.de>
2023-06-23 09:34:46 +02:00
Markus Heiser
86db08793b [fix] implement a JSONEncoder for the json format
This patch implements a simple JSONEncoder just to fix #2502 / on the long term
SearXNG needs a data schema for the result items and a json generator for the
result list.

Closes: https://github.com/searxng/searxng/issues/2505
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-19 19:49:44 +02:00
Markus Heiser
fa1ef9a07b [mod] move some code from webapp module to webutils module (no functional change)
Over the years the webapp module became more and more a mess.  To improve the
modulaization a little this patch moves some implementations from the webapp
module to webutils module.

HINT: this patch brings non functional change

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-19 19:49:44 +02:00
searxng-bot
71b6ff07ca [translations] update from Weblate
98f61c70 - 2023-06-15 - alexgabi <alexgabi@disroot.org>
a1679b93 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
ebd1d574 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
b28a1da3 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
56409bf0 - 2023-06-11 - return42 <markus.heiser@darmarit.de>
abc4916c - 2023-06-10 - return42 <markus.heiser@darmarit.de>
b1900abe - 2023-06-10 - return42 <markus.heiser@darmarit.de>
b48e84c4 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
bf395e32 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
c9c0a3c9 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
3f50d31e - 2023-06-10 - return42 <markus.heiser@darmarit.de>
9da1c142 - 2023-06-09 - artnay <jiri.gronroos@iki.fi>
2023-06-16 09:20:43 +02:00
searxng-bot
1be27d5d83 [translations] update from Weblate
b40da1a3 - 2023-06-06 - return42 <markus.heiser@darmarit.de>
666ee7d4 - 2023-06-06 - return42 <markus.heiser@darmarit.de>
1e0e8ead - 2023-06-06 - return42 <markus.heiser@darmarit.de>
404b9937 - 2023-06-07 - Ivan Gabaldon <admin@inetol.net>
a627f9a1 - 2023-06-04 - return42 <markus.heiser@darmarit.de>
a234d2f8 - 2023-06-04 - gallegonovato <fran-carro@hotmail.es>
cc41f9b5 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
24651eac - 2023-06-02 - return42 <markus.heiser@darmarit.de>
c37b0627 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
9a435ea1 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
40e0adad - 2023-06-02 - return42 <markus.heiser@darmarit.de>
6833b142 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
00f397ad - 2023-06-02 - tentsbet <remendne@pentrens.jp>
7d3d4a97 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
f7d713a4 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
b1ec3160 - 2023-06-03 - ghose <correo@xmgz.eu>
04591a3a - 2023-06-02 - return42 <markus.heiser@darmarit.de>
cb3ac67c - 2023-06-02 - return42 <markus.heiser@darmarit.de>
fe81dbc7 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
7882670f - 2023-06-02 - return42 <markus.heiser@darmarit.de>
38882f3b - 2023-06-02 - return42 <markus.heiser@darmarit.de>
c6df5047 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
6ca23c3b - 2023-06-02 - return42 <markus.heiser@darmarit.de>
72f1ee09 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
2023-06-09 07:07:51 +00:00
Markus Heiser
22b13f4fa5 [mod] tools.Config.get(): add missing type annotations
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 14:07:19 +02:00
Markus Heiser
f3763d73ad [mod] limiter: blocklist and passlist (ip_lists)
A blocklist and a passlist can be configured in /etc/searxng/limiter.toml::

    [botdetection.ip_lists]
    pass_ip = [
      '51.15.252.168',  # IPv4 of check.searx.space
    ]

    block_ip = [
      '93.184.216.34',  # IPv4 of example.org
    ]

Closes: https://github.com/searxng/searxng/issues/2127
Closes: https://github.com/searxng/searxng/pull/2129
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 14:07:19 +02:00
Markus Heiser
f77807257b [fix] engines: don't spam marginalia.nu with default settings
The engine configuration of marginalia [2][3][4][5] spams marginalia.nu with
requests from SearXNG instances [1].  It is not in the interest of SearXNG to
disturb other FOSS projects, so the engine will be removed::

    - name: marginalia
      engine: json_engine
      shortcut: mar
      categories: general
      paging: false
      # Key and license: https://www.marginalia.nu/marginalia-search/api/
      # index: 0 popular, 1 blogs, 2 big_sites, 3 default, 4 experimental
      search_url: https://api.marginalia.nu/<insert your key here>/search/{query}?index=4&count=20
      results_query: results
      url_query: url
      title_query: title
      content_query: description
      timeout: 1.5
      disabled: true
      about:
        website: https://www.marginalia.nu/
        official_api_documentation: https://api.marginalia.nu/
        use_official_api: true
        require_api_key: true
        results: JSON

[1] https://github.com/searxng/searxng/issues/1673
[2] https://github.com/searxng/searxng/pull/1627
[3] https://github.com/searxng/searxng/issues/1620
[4] https://news.ycombinator.com/item?id=35874640
[5] d82a858491/code/services-satellite/api-service/src/main/java/nu/marginalia/api/svc/ResponseCache.java (L12-L20)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 08:23:17 +02:00
Markus Heiser
80aaef6c95
Merge pull request #2357 / limiter -> botdetection
The monolithic implementation of the limiter was divided into methods and
implemented in the Python package searx.botdetection.  Detailed documentation on
the methods has been added.

The methods are divided into two groups:

1. Probe HTTP headers

- Method http_accept
- Method http_accept_encoding
- Method http_accept_language
- Method http_connection
- Method http_user_agent

2. Rate limit:

- Method ip_limit
- Method link_token (new)

The (reduced) implementation of the limiter is now in the module
searx.botdetection.limiter.  The first group was transferred unchanged to this
module.  The ip_limit contains the sliding windows implemented by the limiter so
far.

This merge also fixes some long outstandig issue:

- limiter does not evaluate the Accept-Language correct [1]
- limiter needs a IPv6 prefix to block networks instead of IPs [2]

Without additional configuration the limiter works as before (apart from the
bugfixes).  For the commissioning of additional methods (link_toke), a
configuration must be made in an additional configuration file.  Without this
configuration, the limiter runs as before (zero configuration).

The ip_limit Method implements the sliding windows of the vanilla limiter,
additionally the link_token method can be used in this method.  The link_token
method can be used to investigate whether a request is suspicious. To activate
the link_token method in the ip_limit method add the following to your
/etc/searxng/limiter.toml::

    [botdetection.ip_limit]
    link_token = true


[1] https://github.com/searxng/searxng/issues/2455
[2] https://github.com/searxng/searxng/issues/2477
2023-06-03 06:00:15 +02:00
Markus Heiser
1a1ab34d9d [fix] URL percent-encoding in translations fail in babel
Closes: https://github.com/searxng/searxng/issues/2482
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-02 20:30:41 +02:00
Markus Heiser
b867c39ce0 [build] /static 2023-06-02 19:05:43 +02:00
Markus Heiser
2149e88bdd [mod] template preferences: split into elements (no functional change)
HINT: this patch has no functional change / it is the preparation for following
      changes and bugfixes

Over the years, the preferences template became an unmanageable beast.  To make
the source code more readable the monolith is splitted into elements.  The
splitting into elements also has the advantage that a new template can make use
of them.

The reversed checkbox is a quirk that is only used in the prefereces and must be
eliminated in the long term.  For this the macro 'checkbox_onoff_reversed' was
added to the preferences.html template.  The 'checkbox' macro is also a quirk of
the preferences.html we don't want to use in other templates (it is an
input-checkbox in a HTML form that was misused for status display).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-02 19:05:43 +02:00
searxng-bot
789b43ab60 [translations] update from Weblate
5344314f - 2023-05-30 - return42 <markus.heiser@darmarit.de>
ee8fd955 - 2023-06-01 - BBTranslate <357835338@qq.com>
1ce31caf - 2023-05-29 - return42 <markus.heiser@darmarit.de>
fe75c53d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
ca60af52 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
f34b88f3 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
22d76a26 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
43d8c982 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
43a92e85 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
2bfc12dd - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e2b5fb5f - 2023-05-29 - return42 <markus.heiser@darmarit.de>
9f088420 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
bdf81b4c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f6a24c5d - 2023-05-30 - return42 <markus.heiser@darmarit.de>
01bcea56 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
8c0209f8 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
c629c610 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
a4e4945d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
96bad166 - 2023-06-01 - mradalbert <mister.adalbert@gmail.com>
b0032d90 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
366adaef - 2023-05-29 - return42 <markus.heiser@darmarit.de>
2e4271bf - 2023-05-29 - return42 <markus.heiser@darmarit.de>
c5856fd6 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
790b5a6f - 2023-05-29 - return42 <markus.heiser@darmarit.de>
6c9f92a9 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f5a6a35d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
4c8eeb32 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
7b8c0618 - 2023-05-30 - nicfab <nicfab@icloud.com>
4e851dd4 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
0fa6006e - 2023-05-29 - return42 <markus.heiser@darmarit.de>
877f4396 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
c3bb1da7 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e66e6fae - 2023-05-30 - return42 <markus.heiser@darmarit.de>
1cac4771 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
949e994f - 2023-05-28 - ghose <correo@xmgz.eu>
8b181582 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
65f8fb93 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
e5088e1c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f151100c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
51d169fa - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e68ac961 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
c336c5a1 - 2023-05-31 - dom1torii <djmdmitri.a@gmail.com>
88bda0d0 - 2023-05-30 - Fijxu <fijxu@zzls.xyz>
6a57c29a - 2023-05-29 - return42 <markus.heiser@darmarit.de>
0c585b4d - 2023-05-30 - return42 <markus.heiser@darmarit.de>
e8ca9891 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
817b2da4 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
6b2508aa - 2023-05-29 - return42 <markus.heiser@darmarit.de>
3a5b1842 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
fd826ab8 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
a3938c43 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
30cad6b2 - 2023-05-30 - Ivan Gabaldon <admin@inetol.net>
e997055f - 2023-05-30 - return42 <markus.heiser@darmarit.de>
de6bd3d8 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
ba5e0129 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e48fd248 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
b0e7d3f1 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
2023-06-02 09:34:36 +02:00
Markus Heiser
80af38d37b [mod] increase SUSPICIOUS_IP_WINDOW from one day to 30 days
In my tests I see bots rotating IPs (with endless IP lists).  If such a bot has
100 IPs and has three attempts (SUSPICIOUS_IP_MAX = 3) then it can successfully
send up to 300 requests in one day while rotating the IP.  To block the bots for
a longer period of time the SUSPICIOUS_IP_WINDOW, as the time period in which an
IP is observed, must be increased.

For normal WEB-browsers this is no problem, because the SUSPICIOUS_IP_WINDOW is
deleted as soon as the CSS with the token is loaded.

SUSPICIOUS_IP_WINDOW = 3600 * 24 * 30
  Time (sec) before sliding window for one suspicious IP expires.

SUSPICIOUS_IP_MAX = 3
  Maximum requests from one suspicious IP in the :py:obj:`SUSPICIOUS_IP_WINDOW`."""

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 16:00:49 +02:00
Markus Heiser
281e36f4b7 [fix] limiter: replace real_ip by IPv4/v6 network
Closes: https://github.com/searxng/searxng/issues/2477
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 15:51:14 +02:00
Markus Heiser
38431d2e14 [fix] correct determination of the IP for the request
For correct determination of the IP to the request the function
botdetection.get_real_ip() is implemented.  This fonction is used in the
ip_limit and link_token method of the botdetection and it is used in the
self_info plugin.

A documentation about the X-Forwarded-For header has been added.

[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566211059

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
b8c7c2c9aa [mod] botdetection - improve ip_limit and link_token methods
- counting requests in LONG_WINDOW and BURST_WINDOW is not needed when the
  request is validated by the link_token method [1]

- renew a ping-key on validation [2], this is needed for infinite scrolling,
  where no new token (CSS) is loaded. / this does not fix the BURST_MAX issue in
  the vanilla limiter

- normalize the counter names of the ip_limit method to 'ip_limit.*'

- just integrate the ip_limit method straight forward in the limiter plugin /
  non intermediate code --> ip_limit now returns None or a werkzeug.Response
  object that can be passed by the plugin to the flask application / non
  intermediate code that returns a tuple

[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566113277
[2] https://github.com/searxng/searxng/pull/2357#discussion_r1208542206
[3] https://github.com/searxng/searxng/pull/2357#issuecomment-1566125979

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
52f1452c09 [mod] limiter: ip_limt - monitore suspicious IPs
To intercept bots that get their IPs from a range of IPs, there is a
``SUSPICIOUS_IP_WINDOW``.  In this window the suspicious IPs are stored for a
longer time.  IPs stored in this sliding window have a maximum of
``SUSPICIOUS_IP_MAX`` accesses before they are blocked.  As soon as the IP makes
a request that is not suspicious, the sliding window for this IP is droped.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
9d7456fd6c [fix] limiter.toml: botdetection.ip_limit turn off link_token by default
To activate the ``link_token`` method in the ``ip_limit`` method add the
following to your ``/etc/searxng/limiter.toml``::

   [botdetection.ip_limit]
   link_token = true

Related: https://github.com/searxng/searxng/pull/2357#issuecomment-1554116941
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
66fdec0eb9 [mod] limiter: add config file /etc/searxng/limiter.toml
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
1ec325adcc [mod] limiter -> botdetection: modularization and documentation
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.

This patch does not contain functional change, except it fixes issue #2455

----

Aktivate limiter in the settings.yml and simulate a bot request by::

    curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
         -H 'Accept: text/html'
         -H 'User-Agent: xyz' \
         -H 'Accept-Encoding: gzip' \
         'http://127.0.0.1:8888/search?q=foo'

In the LOG:

    DEBUG   searx.botdetection.link_token : missing ping for this request: .....

Since ``BURST_MAX_SUSPICIOUS = 2`` you can repeat the query above two time
before you get a "Too Many Requests" response.

Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
Markus Heiser
5226044c13 [mod] limiter: add random token to the limiter URL
By adding a random component in the limiter URL a bot can no longer send a ping
by request a static URL.

Related: https://github.com/searxng/searxng/pull/2357#issuecomment-1518525094
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
Markus Heiser
dba569462d [mod] limiter: reduce request rates for requests without a ping
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
dalf
c1b5ff7e1c Update searx.data - update_engine_descriptions.py 2023-05-29 07:28:50 +02:00
dalf
2ba50d392e Update searx.data - update_currencies.py 2023-05-29 07:28:18 +02:00
dalf
cb843ef13c Update searx.data - update_engine_traits.py 2023-05-29 07:27:40 +02:00
dalf
512e001277 Update searx.data - update_firefox_version.py 2023-05-29 07:27:07 +02:00