Anecdotally, using SearX over unreliable proxies,
like Tor, seems to be quite error-prone.
SearX puts considerable effort into measuring the
performance and reliability of engines, most
likely owing to those aspects being of
significant concern.
This patch proposes to mitigate related
problems by issuing concurrent redundant requests
through the specified proxies at once, returning
the first response that is not an error.
The functionality is enabled using the
`proxy_request_redundancy` parameter within the
outgoing network settings or the engine settings.
Example:
```yaml
outgoing:
  request_timeout: 8.0
  proxies:
    "all://":
      - socks5h://tor:9050
      - socks5h://tor1:9050
      - socks5h://tor2:9050
      - socks5h://tor3:9050
  proxy_request_redundancy: 4
```
In this example, each network request will be
sent 4 times, once through every proxy. The
first (non-error) response wins.
In my testing environment using several Tor proxy
endpoints, this approach almost entirely
removes engine errors related to timeouts
and denied requests. The latency of the
network system is also improved.
The implementation uses an
`AsyncParallelTransport(httpx.AsyncBaseTransport)`
wrapper to wrap multiple sub-transports,
and `asyncio.wait` to wait on the first completed
request.
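The first-response-wins strategy can be illustrated with a minimal,
self-contained sketch. It is not the code from the patch and does not use
httpx: `first_successful` and `fake_request` are hypothetical names, and the
simulated requests merely stand in for requests sent through different proxies.

```python
import asyncio


async def first_successful(coros):
    """Run all coroutines concurrently; return the first result that is not an error."""
    pending = {asyncio.ensure_future(c) for c in coros}
    last_exc = None
    try:
        while pending:
            # Wake up as soon as at least one request has finished.
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                if task.exception() is None:
                    return task.result()
                last_exc = task.exception()
        # Every request failed (or none was given): surface the last error.
        raise last_exc if last_exc else RuntimeError("no requests given")
    finally:
        # Cancel the redundant requests that are still in flight.
        for task in pending:
            task.cancel()


async def fake_request(name, delay, fail=False):
    """Hypothetical stand-in for a request sent through one proxy."""
    await asyncio.sleep(delay)
    if fail:
        raise ConnectionError(f"{name} failed")
    return name
```

A failing request that finishes early does not end the race; the loop keeps
waiting until one request succeeds or all of them have failed.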
The existing implementation of the network
proxy cycling has also been moved into the
`AsyncParallelTransport` class, which should
improve network client memoization and
performance.
TESTED:
- unit tests for the new functions and classes.
- tested on desktop PC with 10+ upstream proxies
and comparable request redundancy.
The use of img_src AND thumbnail in the default results makes no sense (only a
thumbnail is needed). In the current state this is rather confusing, because
img_src is displayed like a thumbnail (small) and thumbnail is displayed like an
image (large).
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
The names of the links are tags rather than real names, and they sometimes vary
greatly in their spelling:
- GitHub: github, Github
- Source code: Repository, SCM, Project Source Code
- Documentation: docs, Documentation
The names were standardized to terms such as 'Source code' and 'Documentation',
as translations already exist for these terms.
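A minimal sketch of such a normalization, assuming a simple case-insensitive
lookup table; `LINK_NAME_ALIASES` and `normalize_link_name` are hypothetical
names and not taken from the patch:

```python
# Hypothetical mapping from tag-like link names to the standardized terms.
LINK_NAME_ALIASES = {
    "github": "GitHub",
    "repository": "Source code",
    "scm": "Source code",
    "project source code": "Source code",
    "docs": "Documentation",
    "documentation": "Documentation",
}


def normalize_link_name(name: str) -> str:
    """Return the standardized term, or the name unchanged if it is unknown."""
    return LINK_NAME_ALIASES.get(name.strip().lower(), name)
```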
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This patch is a leftover from [1], in which the WIKIDATA_UNITS values have
become a dictionary.
[1] https://github.com/searxng/searxng/pull/3378
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
- l10n support: parse and format decimal numbers by babel
- ability to add additional units
- improved unit detection (symbols are not unique)
- support for alias units (0,010C to F --> 32,018 °F)
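
The alias example above can be reproduced with a small sketch. It is an
illustration only: the patch uses babel for locale-aware parsing and
formatting, whereas this sketch swaps in a plain decimal-comma replacement to
stay within the standard library. All names (`UNIT_ALIASES`,
`parse_decimal_comma`, `convert_temperature`, `format_decimal_comma`) are
hypothetical.

```python
# Hypothetical alias table: several spellings map to one canonical unit symbol.
UNIT_ALIASES = {
    "c": "°C", "°c": "°C", "celsius": "°C",
    "f": "°F", "°f": "°F", "fahrenheit": "°F",
}


def parse_decimal_comma(text: str) -> float:
    """Parse a number written with a decimal comma (stand-in for babel)."""
    return float(text.replace(",", "."))


def format_decimal_comma(value: float, digits: int = 3) -> str:
    """Format a number with a decimal comma (stand-in for babel)."""
    return f"{value:.{digits}f}".replace(".", ",")


def convert_temperature(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between Celsius and Fahrenheit, resolving unit aliases first."""
    src = UNIT_ALIASES[from_unit.lower()]
    dst = UNIT_ALIASES[to_unit.lower()]
    if src == dst:
        return value
    if src == "°C":
        return value * 9 / 5 + 32
    return (value - 32) * 5 / 9


converted = convert_temperature(parse_decimal_comma("0,010"), "C", "F")
print(format_decimal_comma(converted), "°F")  # prints "32,018 °F"
```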
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Another trip into the hell of dependencies: docutils tends to put major changes
in minor patches: the executables have been renamed, e.g.
rst2html.py --> rst2html
so we have to use docutils at least from version 0.21.2, but this version of
docutils is only supported by myst-parser from version 3.0.1 on.
Additionally, docutils decided to drop Python 3.8 in version 0.21 [1].
Further, linuxdoc needed an update to cope with docutils 0.21 [2].
[1] https://docutils.sourceforge.io/RELEASE-NOTES.html#release-0-21-2024-04-09
[2] https://github.com/return42/linuxdoc/pull/36
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Startpage has changed its HTML layout: classes like ``w-gl__result__main`` no
longer exist and the result items have been slightly changed in their
structure.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Previously, only result URLs were set to open in a new tab by default; this
change should make the behaviour consistent.
Also adds the missing rel="noreferrer" to the anchor tag. Although this should
not be needed as long as the `referrer-policy: no-referrer` header is set, it
is better to be safe than sorry. For example, some reverse proxy
configurations might strip off unwhitelisted headers, in which case it's
nice to have this set.
Sometimes the URL prefix switches from http to https. This patch hardens the
code that removes the URL prefix from the wikidata Q-name; the issue has been
reported in [1].
[1] https://github.com/searxng/searxng/pull/3437#issuecomment-2082121730
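A scheme-independent prefix removal could look like the following sketch. It
assumes entity URLs of the form `http(s)://www.wikidata.org/entity/Q…`; the
names `ENTITY_PREFIX` and `q_name` are hypothetical and not taken from the
patch.

```python
import re

# Assumed prefix of wikidata entity URLs; matches both http and https.
ENTITY_PREFIX = re.compile(r"^https?://www\.wikidata\.org/entity/")


def q_name(entity_url: str) -> str:
    """Strip the entity URL prefix and return the bare Q-name."""
    return ENTITY_PREFIX.sub("", entity_url)
```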
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
In the "Engines" tab on searx.space [1] nearly all engines report a
TimeoutException: yep engine
As documented in issue #2444 [2], this problem can be fixed by increasing the
timeout. Note: on a local instance (`make run`) the timeout of 3 sec was
sufficient, at least in my local test, but the overall picture on searx.space
leads me to believe that this tight timeout is usually not sufficient.
[1] https://searx.space/
[2] https://github.com/searxng/searxng/issues/2444
Closes https://github.com/searxng/searxng/issues/3421
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
CCC media serves several recording formats, to name a few:
- application/x-subrip
- video/mp4
- video/webm
- audio/mpeg
- audio/opus
Not all of them are suitable for a video frame. If available, we should prefer
video/mp4 due to its minimal data rates.
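
The selection could be sketched as follows. This is an illustration under
assumptions, not the code from the patch: the field names `mime_type` and
`recording_url`, the example URLs, and the fallback to video/webm are all
hypothetical.

```python
# Preference order; the video/webm fallback is an assumption of this sketch.
PREFERRED_MIME_TYPES = ("video/mp4", "video/webm")


def pick_video_recording(recordings):
    """Return the first recording with a preferred video MIME type, else None."""
    for mime_type in PREFERRED_MIME_TYPES:
        for recording in recordings:
            if recording.get("mime_type") == mime_type:
                return recording
    return None


recordings = [
    {"mime_type": "application/x-subrip", "recording_url": "https://example.org/talk.srt"},
    {"mime_type": "video/webm", "recording_url": "https://example.org/talk.webm"},
    {"mime_type": "video/mp4", "recording_url": "https://example.org/talk.mp4"},
]
print(pick_video_recording(recordings)["recording_url"])  # https://example.org/talk.mp4
```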
Closes: https://github.com/searxng/searxng/issues/3431
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>