45 Commits

Author SHA1 Message Date
jazzzooo b6316020f7 [fix] spelling 2023-10-19 18:02:05 +02:00
Alexandre Flament 89fe1c6c1a [mod] searx.network: memory optimization
Avoid creating an SSLContext in AsyncHTTPTransportNoHttp

See:
* 0f61aa58d6/httpx/_transports/default.py (L271)
* https://github.com/encode/httpx/issues/2298
2023-08-28 10:25:44 +02:00
Alexandre Flament 3ea3ade01b Bump httpx from 0.21.2 to 0.24.1 2023-08-28 10:25:44 +02:00
Markus Heiser 8fa54ffddf [mod] Shuffle httpx's default ciphers of a SSL context randomly.
From the analysis by @9Ninety [1] we know that DDG (and maybe other engines; I
have startpage in mind) uses some kind of TLS fingerprinting to block bots.

This patch shuffles the default ciphers from httpx to avoid a cipher profile
that is known to httpx (and blocked by DDG).

[1] https://github.com/searxng/searxng/issues/2246#issuecomment-1467895556

----

From `What Is TLS Fingerprint and How to Bypass It`_

> When implementing TLS fingerprinting, servers can't operate based on a
> locked-in whitelist database of fingerprints.  New fingerprints appear
> when web clients or TLS libraries release new versions. So, they have to
> live off a blocklist database instead.
> ...
> It's safe to leave the first three as is but shuffle the remaining ciphers
> and you can bypass the TLS fingerprint check.

.. _What Is TLS Fingerprint and How to Bypass It:
   https://www.zenrows.com/blog/what-is-tls-fingerprint#how-to-bypass-tls-fingerprinting

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Closes: https://github.com/searxng/searxng/issues/2246
2023-03-19 13:40:31 +01:00
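
The shuffling idea described in the commit above can be sketched with the standard library alone. This is only an illustration, not the code from the patch: the function name and the `keep_first` parameter are assumptions, and the cipher list is read from a stock `ssl` context rather than from httpx's defaults.

```python
import random
import ssl


def shuffle_ciphers(ssl_context: ssl.SSLContext, keep_first: int = 3) -> list:
    """Keep the first `keep_first` ciphers in place and shuffle the rest.

    Hypothetical helper. TLS 1.3 suites are skipped because
    SSLContext.set_ciphers() only configures TLS 1.2 and older ciphers.
    """
    names = [
        cipher["name"]
        for cipher in ssl_context.get_ciphers()
        if cipher.get("protocol") != "TLSv1.3"
    ]
    head, tail = names[:keep_first], names[keep_first:]
    random.shuffle(tail)  # in-place shuffle of everything after the head
    ordered = head + tail
    ssl_context.set_ciphers(":".join(ordered))
    return ordered


context = ssl.create_default_context()
new_order = shuffle_ciphers(context)
```

Each call produces a different cipher order, so the resulting TLS handshake is less likely to match a single known profile.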
Alexandre Flament 37addec69e search.suspended_time settings: bug fixes
* fix typo in settings.yml: replace suspend_times by suspended_times
* always use delay defined in settings.yml:
  * HTTP status 402 and 403: read the value from settings.yml instead of using the hardcoded value of 1 day.
  * startpage engine: a CAPTCHA suspends the engine for one day instead of one week
2023-01-28 10:24:14 +00:00
Léon Tiekötter 0cedb1c6d8 Add search.suspended_times settings
Make suspended_time changeable in settings.yml
Allow different values to be set for different exceptions.

Co-authored-by: Alexandre Flament <alex@al-f.net>
2023-01-15 09:00:32 +00:00
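
A per-exception suspension time, as described in the commit above, could look roughly like the following. The exception classes, the mapping keys, and all durations are hypothetical placeholders; in a real setup the values would come from settings.yml.

```python
# Hypothetical fallback used when an exception has no dedicated entry (seconds).
DEFAULT_SUSPENDED_TIME = 60

# Hypothetical mapping of exception name to suspension time in seconds.
SUSPENDED_TIMES = {
    "AccessDenied": 86400,
    "Captcha": 86400,
    "TooManyRequests": 3600,
}


class AccessDenied(Exception):
    """Placeholder for an 'access denied' error (e.g. HTTP 402 / 403)."""


class Captcha(Exception):
    """Placeholder for a CAPTCHA challenge."""


class TooManyRequests(Exception):
    """Placeholder for a rate-limit error."""


def get_suspended_time(exc: Exception) -> int:
    """Return how long an engine should be suspended after `exc`."""
    return SUSPENDED_TIMES.get(type(exc).__name__, DEFAULT_SUSPENDED_TIME)
```

Looking the duration up by exception name keeps the policy in configuration rather than hardcoded at each call site.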
Evhorizon 1517724615
Update network.py 2022-11-06 20:35:30 +01:00
Alexandre Flament 32e8c2cf09 searx.network: add "verify" option to the networks
Each network can define a verify option:
* false to disable certificate verification
* a path to an existing certificate.

SearXNG uses SSL_CERT_FILE and SSL_CERT_DIR when they are defined
see https://www.python-httpx.org/environment_variables/#ssl_cert_file
2022-10-14 13:59:22 +00:00
Alexandre Flament a1e8af0796 bing.py: resolve bing.com/ck/a redirections
add a new function searx.network.multi_requests to send multiple HTTP requests at once
2022-07-08 22:02:21 +02:00
Markus Heiser 2de007138c [fix] prepare for pylint 2.14.0
Remove issues reported by Pylint 2.14.0:

- no-self-use: has been moved to optional extension [1]
- The refactoring checker now also raises 'consider-using-generator' messages
  for max(), min() and sum(). [2]

.pylintrc:
  - <option name>-hint was removed long ago; Pylint 2.14.0 raises an
    error on invalid options
  - bad-continuation and bad-whitespace have been removed [3]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/summary.html#removed-checkers
[2] https://pylint.pycqa.org/en/latest/whatsnew/2/2.14/full.html#what-s-new-in-pylint-2-14-0
[3] https://pylint.pycqa.org/en/latest/whatsnew/2/2.6/summary.html#summary-release-highlights

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-06-03 15:41:52 +02:00
Alexandre Flament f3f61df6a0 [mod] remove deprecated code
remove code to support Python 3.5 and Python 3.6
2022-01-29 08:54:12 +01:00
Alexandre Flament 4f82ab36a9
Merge pull request #817 from not-my-profile/pyright-01
Pyright 01
2022-01-27 23:18:41 +01:00
Léon Tiekötter 0cbf73a1f4
Allow 'using_tor_proxy' to be set for each engine individually
Check 'using_tor_proxy' for each engine individually instead of checking globally

[fix] searx.network: update _rdns test to the latest httpx version

Co-authored-by: Alexandre Flament <alex@al-f.net>
2022-01-27 22:37:02 +01:00
Martin Fischer b767752d0c [pyright:basic] searx.webapp 2022-01-27 22:17:16 +01:00
Martin Fischer def62c3a47 [typing] add type hints for dictionaries 2022-01-17 11:42:48 +01:00
Alexandre Flament e64c3deab7 [mod] upgrade httpx 0.21.2
httpx 0.21.2 and httpcore 0.14.4 fix multiple issues:
* https://github.com/encode/httpx/releases/tag/0.21.2
* https://github.com/encode/httpcore/releases/tag/0.14.4

so most of the workarounds in searx.network have been removed.
2022-01-05 18:46:00 +01:00
Markus Heiser 3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-12-27 09:26:22 +01:00
Alexandre Flament f9c6393502 [enh] verify that Tor proxy works every time searx starts
based on @MarcAbonce commit on searx
2021-10-12 21:01:02 +02:00
Alexandre Flament a9c3c88cc0 [mod] searx.network.stream returns a tuple (response, stream) 2021-09-28 19:33:29 +02:00
Alexandre Flament 29893cf816 [fix] searx.network.stream: fix memory leak 2021-09-28 19:28:12 +02:00
Alexandre Flament dc74df3a55
Merge pull request #261 from dalf/upgrade_httpx
[upd] upgrade httpx 0.19.0
2021-09-17 11:48:37 +02:00
Markus Heiser 443bf35e09 [pylint] fix global-variable-not-assigned issues
If there is no write access, there is no need for global.  Remove global
statement if there is no assignment.

global-variable-not-assigned:
  Using global for names but no assignment is done. Used when a variable is
  defined through the "global" statement but no assignment to this variable is
  done.

In Pylint 2.11 the global-variable-not-assigned checker now catches global
variables that are never reassigned in a local scope and catches (reassigned)
functions [1][2]

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.11.html
[2] https://github.com/PyCQA/pylint/issues/1375

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-17 10:14:27 +02:00
Alexandre Flament b10403d3a1 [mod] searx.network: remove redundant code
searx.client.new_client: the proxies parameter is a dictionary,
and the protocol (key of the dictionary) is already normalized
(see usage of searx.network.network.PROXY_PATTERN_MAPPING)
2021-09-17 10:06:24 +02:00
Alexandre Flament 8e73438cbe [upd] upgrade httpx 0.19.0
adjust searx.network module to the new internal API
see https://github.com/encode/httpx/pull/1522
2021-09-17 10:06:22 +02:00
Alexandre Flament 602cbc2c99
Merge pull request #297 from dalf/engine-logger-enh
debug mode: more readable logging
2021-09-14 07:06:28 +02:00
Alexandre Flament 2b53d718e4 [fix] PR #257: use the image_proxy network instead of the default network 2021-09-11 11:15:51 +02:00
Alexandre Flament 91a6d80e82 [mod] debug mode: log HTTP requests with network name
For example wikipedia requests use the logger name "searx.network.wikipedia"

Logging is disabled when searx_debug is False
2021-09-11 10:13:14 +02:00
Markus Heiser 2a3b9a2e26 [pylint] searx: drop no longer needed 'missing-function-docstring'
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-09-07 13:34:35 +02:00
Markus Heiser 03e7d423be [pylint] Pylint 2.10 - unused-variable
Pylint 2.10 fixed [1]:

  Fixed bug with cell-var-from-loop checker: it no longer has false negatives
  when both unused-variable and used-before-assignment are disabled.

[1] https://pylint.pycqa.org/en/latest/whatsnew/2.10.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-08-31 10:51:50 +02:00
Alexandre Flament 3b0f70ed0f [mod] /image_proxy: use HTTP/1 instead of HTTP/2
httpx: HTTP/2 is slow when a lot of data is downloaded.
https://github.com/dalf/pyhttp-benchmark

also, the usage of HTTP/1 decreases the load average
2021-08-24 14:51:20 +02:00
Alexandre Flament 43fcaa642a [fix] image_proxy: always close the httpx response
previously, when the content type was not an image or some other error occurred,
the httpx response was not closed
2021-08-24 14:51:20 +02:00
Alexandre Flament df15c655f7 [mod] /image_proxy: don't decompress images 2021-08-24 14:51:20 +02:00
Alexandre Flament 4b07df62e5 [mod] move all default settings into searx.settings_defaults 2021-06-01 08:10:15 +02:00
Markus Heiser 2128022f72 [coding-style] searx/network/network.py - normalized indentations
No functional change!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:48:47 +02:00
Markus Heiser 1499002ceb [coding-style] searx/network/client.py - normalized indentations
No functional change!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:44:43 +02:00
Markus Heiser e4211da639 [pylint] searx/network/raise_for_httperror.py
No functional change!

- fix messages from pylint
- add ``global NETWORKS``
- normalized indentations

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:40:10 +02:00
Markus Heiser 44efa911ba [pylint] searx/network/network.py & add global (NETWORKS)
No functional change!

- fix messages from pylint
- add ``global NETWORKS``

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:39:46 +02:00
Markus Heiser b595c482d0 [pylint] searx/network/client.py & add global (TRANSPORT_KWARGS)
No functional change!

- fix messages from pylint
- add ``global TRANSPORT_KWARGS``
- normalized python_socks imports

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:39:37 +02:00
Markus Heiser 8033518899 [pylint] searx/network/__init__.py & add global (THREADLOCAL)
No functional change!

- fix messages from pylint
- add ``global THREADLOCAL``
- normalized various indentation

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-05-24 17:39:14 +02:00
Alexandre Flament ec83493538 [fix] offline engine: don't crash on time recording 2021-05-22 15:17:18 +02:00
Alexandre Flament 0f4e995ab4 [mod] searx.network.client: the same configuration reuses the same ssl.SSLContext
before there was one ssl.SSLContext per client.

see https://github.com/encode/httpx/issues/978
2021-05-05 20:36:37 +02:00
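
Reusing one `ssl.SSLContext` per configuration can be sketched with a cache keyed on the configuration values. The function name and its two parameters are assumptions chosen for illustration; the real module may key the cache on more options.

```python
import ssl
from functools import lru_cache


@lru_cache(maxsize=None)
def get_sslcontext(verify: bool = True, http2: bool = False) -> ssl.SSLContext:
    """Return a shared SSLContext for a given (verify, http2) configuration."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be disabled before verify_mode can be CERT_NONE
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    if http2:
        context.set_alpn_protocols(["h2", "http/1.1"])
    return context
```

Two clients created with the same arguments then share one context object instead of each building their own.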
Alexandre Flament 283ae7bfad [fix] searx.network: fix rare cases where LOOP is None
* searx.network.client.LOOP is initialized in a thread
* searx.network.__init__ imports LOOP which may happen
  before the thread has initialized LOOP

This commit adds a new function "searx.network.client.get_loop()"
to fix this issue
2021-04-27 17:47:36 +02:00
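
The race described above (a module-level loop variable imported before the thread has set it) can be avoided by going through an accessor that waits for initialization. The sketch below is one possible way to do that with a `threading.Event`; it is not necessarily how `get_loop()` is implemented in the commit, and the private names are hypothetical.

```python
import asyncio
import threading

_LOOP = None
_LOOP_READY = threading.Event()


def _run_loop() -> None:
    """Create the event loop inside the worker thread and run it forever."""
    global _LOOP
    _LOOP = asyncio.new_event_loop()
    _LOOP_READY.set()  # signal that _LOOP can now be used
    _LOOP.run_forever()


threading.Thread(target=_run_loop, name="network-loop", daemon=True).start()


def get_loop() -> asyncio.AbstractEventLoop:
    """Block until the worker thread has created the loop, then return it."""
    _LOOP_READY.wait()
    return _LOOP
```

Callers use `get_loop()` at call time instead of importing the variable, so they never observe `None`.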
Alexandre Flament 7acd7ffc02 [enh] rewrite and enhance metrics 2021-04-21 16:24:46 +02:00
Alexandre Flament aae7830d14 [mod] refactoring: processors
Report to the user suspended engines.

searx.search.processor.abstract:
* manages suspend time (per network).
* reports suspended time to the ResultContainer (method extend_container_if_suspended)
* adds the results to the ResultContainer (method extend_container)
* handles exceptions (method handle_exception)
2021-04-21 16:24:46 +02:00
Alexandre Flament d14994dc73 [httpx] replace searx.poolrequests by searx.network
settings.yml:

* outgoing.networks:
   * can contain network definitions
   * properties: enable_http, verify, http2, max_connections, max_keepalive_connections,
     keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
   * retries: 0 by default, number of times searx retries to send the HTTP request (using different IP & proxy each time)
   * local_addresses can be "192.168.0.1/24" (it supports IPv6)
   * support_ipv4 & support_ipv6: both True by default
     see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
   * either a full network description
   * or a reference to an existing network

* all HTTP requests of an engine use the same HTTP configuration (this was not the case before, see proxy configuration in master)
2021-04-12 17:25:56 +02:00
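
A `local_addresses` entry such as "192.168.0.1/24" denotes a whole block of addresses. The following sketch shows how one address could be picked from such an entry with the standard `ipaddress` module. The helper name is hypothetical, and the selection strategy (uniform random choice over the block, including the network and broadcast addresses) is an assumption.

```python
import ipaddress
import random


def pick_local_address(spec: str) -> str:
    """Return one address from `spec`, which is a single IP or a CIDR block."""
    if "/" not in spec:
        # Single address: validate it and return it unchanged.
        return str(ipaddress.ip_address(spec))
    # strict=False accepts host bits in the prefix, e.g. "192.168.0.1/24".
    network = ipaddress.ip_network(spec, strict=False)
    # Index into the block instead of listing it, so large IPv6 ranges work.
    index = random.randrange(network.num_addresses)
    return str(network[index])


source_ip = pick_local_address("192.168.0.1/24")
```

Picking a new address for each attempt is one way to get the "different IP each time" behaviour mentioned for `retries`.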