Commit Graph

2550 Commits

Author SHA1 Message Date
Alexandre Flament 925bb561a2
Merge pull request #2352 from dalf/no_http
Remove HTTP connections as much as possible
2020-12-06 10:18:49 +01:00
Alexandre Flament 28cc644f0a [fix] duckduckgo_definitions: fix relative image URL
ddg returns relative URL to https://duckduckgo.com/
2020-12-06 10:14:09 +01:00
Alexandre Flament cdceec1cbb
Merge pull request #2354 from dalf/fix-wikipedia
[fix] wikipedia engine: don't raise an error when the query is not found
2020-12-04 20:42:45 +01:00
Alexandre Flament f0054d67f1 [fix] wikipedia engine: don't raise an error when the query is not found
Add a new parameter "raise_for_status", set by default to True.
When True, any HTTP status code >= 300 raises an exception (#2332)
When False, the engine can manage the HTTP status code by itself.
2020-12-04 20:04:39 +01:00
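The "raise_for_status" behaviour described in f0054d67f1 can be sketched as follows. This is a minimal illustration and not the searx implementation: `HTTPStatusError` and `check_status` are hypothetical names introduced here, and only the parameter name and the ">= 300" rule come from the commit message.

```python
class HTTPStatusError(Exception):
    """Hypothetical exception type used only in this sketch."""


def check_status(status_code, raise_for_status=True):
    """Return the status code, raising first if checking is enabled.

    With raise_for_status=True, any status code >= 300 raises.
    With raise_for_status=False, the caller (the engine) decides.
    """
    if raise_for_status and status_code >= 300:
        raise HTTPStatusError(f"unexpected HTTP status {status_code}")
    return status_code


# An engine that treats "not found" as "no results" opts out of the check.
status = check_status(404, raise_for_status=False)
results = [] if status == 404 else ["..."]
```

With the default `raise_for_status=True`, the same 404 would raise instead of reaching the engine's own handling.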
Alexandre Flament bef2f2efa8 [fix] wikidata: fix crash when the item has no description at all and at least one URL. 2020-12-04 17:17:20 +01:00
Alexandre Flament 244e812f37 [fix] remove searx/engines/filecrop.py (dead code) 2020-12-04 16:48:15 +01:00
Alexandre Flament 0226ae69d3 [fix] dbpedia autocomplete (and use HTTPS) 2020-12-04 16:47:43 +01:00
Alexandre Flament fa909c7c02 [mod] stackoverflow & yandex: detect CAPTCHA response 2020-12-03 13:23:19 +01:00
Alexandre Flament d0d7a3e1c2 [fix] settings_loader: don't crash when a key exists only in the user settings
typical use case: result_proxy can be defined in the user settings,
but is not defined in the default settings.yml
2020-12-03 11:35:12 +01:00
Alexandre Flament 64cccae99e [mod] various engines: use eval_xpath* functions and searx.exceptions.*
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-12-03 10:22:48 +01:00
Alexandre Flament ad72803ed9 [mod] xpath, 1337x, acgsou, apkmirror, archlinux, arxiv: use eval_xpath_* functions 2020-12-03 10:22:48 +01:00
Alexandre Flament de887c6347 [mod] bing_news: use eval_xpath_getindex
remove unused function searx.utils.list_get
2020-12-03 10:22:48 +01:00
Alexandre Flament 1d0c368746 [enh] record details exception per engine
add a new API /stats/errors
2020-12-03 10:22:48 +01:00
Alexandre Flament 6b5a578822
Merge pull request #2285 from return42/fix-digg
bugfix & refactor digg engine
2020-12-03 10:20:40 +01:00
mrwormo 2b153db74c Disable Invidious engine by default 2020-12-02 21:56:11 +01:00
Markus Heiser bef185723a [refactor] digg - improve results and clean up source code
- strip html tags and superfluous quotation marks from content
- remove unneeded cookie from request
- remove superfluous imports

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-02 21:54:27 +01:00
Markus Heiser 6b0a896f01 [mod] digg - pylint searx/engines/digg.py
Eliminate redundant file names which are tested by test.pylint and ignored by
test.pep8

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-02 20:59:30 +01:00
Markus Heiser 173b744ef0 [fix] digg - the ISO time stamp of published date has been changed
Error pattern::

    Engines cannot retrieve results:
    digg (unexpected crash time data '2020-10-16T14:09:55Z' does not match format '%Y-%m-%d %H:%M:%S')

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-12-02 20:40:12 +01:00
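The error pattern in 173b744ef0 can be reproduced with the standard library alone. The sketch below uses the timestamp and the old format string from the error message. The replacement format string is an assumption that fits this timestamp, not necessarily what the digg engine uses.

```python
from datetime import datetime

# Timestamp taken from the error message in the commit.
raw = "2020-10-16T14:09:55Z"

# The old format from the error message no longer matches ...
try:
    datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    old_format_matches = True
except ValueError:
    old_format_matches = False

# ... while a pattern with a literal "T" and a trailing "Z" does (assumed fix).
published = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ")
```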
Alexandre Flament b00d108673 [mod] pylint: numerous minor code fixes 2020-12-01 15:21:19 +01:00
Alexandre Flament 9ed3ee2beb [mod] wikidata: WDGeoAttribute class: doesn't change the method signature of get_str 2020-12-01 15:21:17 +01:00
Alexandre Flament 3cfef61123 [fix] /stats: report error percentage instead of error count
This bug has existed since PR https://github.com/searx/searx/pull/751
2020-12-01 15:07:09 +01:00
Alexandre Flament a1e6bc4cee
Merge pull request #2291 from dalf/settings2
[enh] user settings can rely on the default settings
2020-12-01 14:57:12 +01:00
Alexandre Flament f1e016e9ea [mod] oscar theme: add an option in gruntfile.js to generate a sourceMap
Credits go to @mrwormo (see PR #2308)
2020-12-01 10:07:01 +01:00
GazoilKerozen 1b700738eb
[fix] fix the reset button in the oscar theme (#2306)
Rely on javascript instead of type="clear"

Close #2009
2020-11-30 16:30:21 +01:00
Noémi Ványi 4a36a3044d
Add recoll engine (#2325)
recoll is a local search engine based on Xapian:
http://www.lesbonscomptes.com/recoll/

By itself recoll does not offer web or API access,
this can be achieved using recoll-webui:
https://framagit.org/medoc92/recollwebui.git

This engine uses a custom 'files' result template

set `base_url` to the location where recoll-webui can be reached
set `dl_prefix` to a location where the file hierarchy as indexed by recoll can be reached
set `search_dir` to the part of the indexed file hierarchy to be searched, use an empty string to search the entire search domain
2020-11-30 08:35:15 +01:00
Alexandre Flament b4b81a5e1a [enh] settings.yml: add use_default_settings option (2nd version) 2020-11-27 19:40:04 +01:00
M. Efe Çetin d1f527c3af
Photon API Link Update
Via https://photon.komoot.io/
2020-11-27 10:22:28 +03:00
Alexandre Flament 1cfe7f2a75 [enh] settings.yml: add use_default_settings option
This change is backward compatible with the existing configurations.

If settings.yml is loaded from a user-defined location (SEARX_SETTINGS_PATH or /etc/searx/settings.yml),
then these settings can rely on the default settings.yml with this option:
use_default_settings: True
2020-11-26 18:27:27 +01:00
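One way to picture the use_default_settings option named in the commit title is a recursive overlay of the user settings on the defaults. The sketch below is an assumption about the general idea, not the searx settings loader: `merge_settings` is a hypothetical helper, and the `server` keys are illustrative values only.

```python
from copy import deepcopy


def merge_settings(default, user):
    """Recursively overlay user settings on a copy of the defaults."""
    merged = deepcopy(default)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            # Also covers keys that exist only in the user settings.
            merged[key] = value
    return merged


# Illustrative data; the key names below are assumptions for this sketch.
default_settings = {"server": {"port": 8888, "bind_address": "127.0.0.1"}}
user_settings = {"use_default_settings": True, "server": {"port": 9000}}

if user_settings.get("use_default_settings"):
    settings = merge_settings(default_settings, user_settings)
else:
    settings = user_settings
```

Here the user file only overrides `port`, and `bind_address` is inherited from the defaults.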
Alexandre Flament 6ada5bac60
Merge pull request #2327 from renyhp/master
Add preference for displaying advanced settings
2020-11-26 17:37:43 +01:00
renyhp 0323606691 Remove unused lines 2020-11-26 17:26:19 +01:00
renyhp 844ae0b310 Fix syntax error 2020-11-26 16:27:46 +01:00
renyhp 4979b4f9d9 Another patch 2020-11-26 15:34:53 +01:00
renyhp 22489c4b5f Patch advanced search preferences 2020-11-23 19:13:29 +01:00
renyhp b00f77059c Add preference for displaying advanced settings 2020-11-22 18:16:43 +01:00
Alexandre Flament 3786920df9 [enh] Add multiple outgoing proxies
credits go to @bauruine see https://github.com/searx/searx/pull/1958
2020-11-20 15:29:21 +01:00
Noémi Ványi 80a8bc5ad9 Fix type of unresponsive_engines
Previously __get_translated_errors
returned a list. But unresponsive_engines
is a set.

Closes #2305
2020-11-17 23:22:45 +01:00
Markus Heiser c71d214b0c [refactor] deviantart - improve results and clean up source code
DeviantArt's request and response forms have been changed.

- fixed title
- fixed time_range_dict to 'popular-*-***'
- use image from <noscript> if exists
- drop obsolete "http to https, remove domain sharding"
- use query URL https://www.deviantart.com/search/deviations?page=5&q=foo
- add searx/engines/deviantart.py to pylint check (test.pylint)

Error pattern::

    There DEBUG:searx:result: invalid title: {'url': 'https://www.deviantart.com/  ...

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-11-14 17:09:56 +01:00
Alexandre Flament 3038052c79 [mod] remove unused import
use
from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA
so it is possible to easily remove all unused import using autoflake:
autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-11-14 14:11:02 +01:00
Alexandre Flament c3d9b17c2a
Merge pull request #2292 from kvch/elasticsearch-engine
New engine: Elasticsearch
2020-11-14 13:25:08 +01:00
Alexandre Flament 102c08838b
Merge pull request #2289 from dalf/pylint
[mod] pylint: add extension-pkg-whitelist=lxml.etree
2020-11-14 13:24:31 +01:00
Alexandre Flament 46b454277f
Merge pull request #2309 from dalf/mod_search_repr
[mod] searx.search: EngineRef, SearchQuery: add __repr__ and __eq__ methods
2020-11-14 13:23:44 +01:00
Alexandre Flament ebed1461bc
Merge pull request #2300 from dalf/fix-webapp-index
[fix] fix of / and /search
2020-11-14 13:23:03 +01:00
Noémi Ványi 43e697681e New engine: Elasticsearch 2020-11-10 19:53:38 +01:00
Alexandre Flament 8fc74d0d7b [mod] searx.search: EngineRef, SearchQuery: add __repr__ and __eq__ methods 2020-11-10 10:45:40 +01:00
Alexandre Flament b3a3ccf2db [fix] fix of / and /search
* URL / : the index page displayed the selected or the default category.
* URL / : when the q parameter is set using the URL, the redirect includes the URL query.
* URL /search : an empty query doesn't raise an exception.
2020-11-06 12:11:52 +01:00
Adam Tauber 063260d090 [enh] add default http headers - closes #715 2020-11-05 16:14:23 +01:00
Adam Tauber 1b42d42695
Merge pull request #2290 from dalf/fix-misc
Various bug fixes
2020-11-03 15:12:25 +01:00
Alexandre Flament 58d72f2692 [mod] pylint: minor code change to allow pylint globally
This commit is only a step, it doesn't fix all the issues reported by pylint
2020-11-03 11:35:53 +01:00
Alexandre Flament e28b56e262 [fix] webadapter: fix locked categories 2020-11-03 10:55:49 +01:00
Alexandre Flament eed43783f9 [fix] command engine: fix import 2020-11-03 10:55:08 +01:00
Alexandre Flament a08df82574 [fix] scanr_structure engine: fix import 2020-11-03 10:54:02 +01:00
Marc Abonce Seguin 8d71420b45 [mod] separate index and search routes
This makes it easier to separately handle search and index requests
from a web server or from a reverse proxy.

If a request to index contains a query, a permanent redirect HTTP response
is returned. This should give some level of backwards compatibility
for users that have set a searx instance in their browser's search bar.
2020-11-02 20:04:03 -07:00
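The redirect rule from 8d71420b45 can be sketched as a pure function, independent of any web framework. `index_response` is a hypothetical helper. The 308 status code is an assumption, since the commit message only says "permanent redirect".

```python
from urllib.parse import urlencode


def index_response(args):
    """Return (status, location) for a request to the index route.

    A query on "/" is redirected permanently to "/search" with the
    original query string preserved; otherwise the index is served.
    """
    if args.get("q"):
        return 308, "/search?" + urlencode(args)
    return 200, None


with_query = index_response({"q": "foo"})
without_query = index_response({})
```

Keeping the two routes separate lets a web server or reverse proxy apply different rules to "/" and "/search", as the commit message notes.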
Alexandre Flament 95bd6033fa [mod] wikidata engine: use one SPARQL request instead of 2 HTTP requests. 2020-10-28 08:09:25 +01:00
Alexandre Flament ca593728af [mod] duckduckgo_definitions: display only user friendly attributes / URL
various bug fixes
2020-10-28 08:09:25 +01:00
Alexandre Flament 382fded665 [mod] result.py: merge infobox URL and attributes when the same label or the same entity
the entities are Wikidata entities (like "Q42" for "Douglas Adams", see https://www.wikidata.org/wiki/Q42)
2020-10-28 08:09:25 +01:00
Alexandre Flament 23f4203dfb [fix] simple theme: infobox: remove useless entity information 2020-10-28 08:09:25 +01:00
Alexandre Flament ed6696e6bf [mod] add external_urls.json and wikidata_units.json 2020-10-28 08:09:25 +01:00
Alexandre Flament 5e7060053c [mod] ahmia_filter.py: minor changes
- use result['parsed_url']
- load ahmia_blacklist.txt in searx.data
2020-10-27 20:00:04 +01:00
Adam Tauber db703a0283
Merge pull request #565 from MarcAbonce/onions
New category: Onions
2020-10-26 14:20:58 +01:00
Adam Tauber 2aef38c3b9 [fix] resolve query_parts regression 2020-10-26 14:15:59 +01:00
Marc Abonce Seguin 32957cdf49 add Ahmia filter plugin for onion results 2020-10-25 17:59:43 -07:00
a01200356 c3daa08537 [enh] Add onions category with Ahmia, Not Evil and Torch
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines might have
their own cache.

Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy as True.

Requires Tor and updating packages.

To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for Tor's (or whatever proxy used) extra
time.
2020-10-25 17:59:05 -07:00
Noémi Ványi 33e139cae6 Let admins lock user preferences 2020-10-25 18:06:18 +01:00
Nicholas Kegler 8e15d3e4c1 Open Semantic Search Engine 2020-10-25 17:50:00 +01:00
Adam Tauber aa3c18dda9 [enh] allow searx query parts anywhere in the query - closes #831 2020-10-25 17:40:36 +01:00
Venca24 2b93e70a26 [fix] code style 2020-10-24 09:20:55 +02:00
Venca24 1cbcf2ccb6 [mod] adapt hash plugin to current version of searx 2020-10-23 21:35:13 +02:00
Venca24 40c552c11e [fix] hash plugin 2020-10-23 21:26:42 +02:00
Venca24 69e5a58058 [fix] code style 2020-10-23 21:26:42 +02:00
Venca24 1ea9438f5d [fix] hash plugin 2020-10-23 21:25:10 +02:00
Venca24 c9593c8ffd [enh] add plugin converting strings into hash digests 2020-10-23 21:25:10 +02:00
Noémi Ványi 116f7a6daa Force admins to set secret_key if debug mode is disabled
This commit also enables debug mode for unit tests.
2020-10-09 18:31:42 +02:00
Noémi Ványi e158eeee4b Propagate error messages from YouTube API 2020-10-09 17:34:26 +02:00
Adam Tauber 835d16cbb1
Merge pull request #2255 from kvch/yacy-improvements
Add yacy improvements: HTTP digest auth, category checking
2020-10-09 16:34:42 +02:00
Alexandre Flament cfd21bc475 [fix] fix duckduckgo engine
- remove paging support: a "vqd" parameter is required between each request. This parameter is unique for each request
- update the URL (no redirect), use the POST method
- language support: works if there is no more than one request per minute, otherwise it is ignored
2020-10-09 16:00:42 +02:00
Noémi Ványi 72c7fd25fe Add yacy improvements: HTTP digest auth, category checking 2020-10-09 15:06:05 +02:00
Adam Tauber a05c660e30 [enh] add ability to set enabled plugins from settings - closes #1613 #778 2020-10-09 14:12:31 +02:00
Noémi Ványi ce000a9fef Fix XPATH of lobste.rs engine && add timeout 2020-10-09 12:56:37 +02:00
Adam Tauber da8b227044 [fix] use base_url everywhere if it is defined in settings.yml 2020-10-08 14:19:09 +02:00
Noémi Ványi f0278d41fc add ebay engine to shopping category 2020-10-08 13:20:55 +02:00
Alexandre Flament a9dc54bebc [mod] Add searx.data module
Instead of loading the data/*.json files in different locations,
load these files in the new searx.data module.
2020-10-07 10:29:34 +02:00
Alexandre Flament e30dc2f0ba
Merge pull request #2247 from dalf/fix-opensearch
[fix] opensearch.xml URL contains method and autocomplete parameters
2020-10-07 10:14:57 +02:00
Alexandre Flament 474d56c77f
Merge pull request #2248 from dalf/fix-webadapter
[fix] various fixes of searx.webadapter
2020-10-07 10:12:10 +02:00
Alexandre Flament d5950079cf [fix] fix searx.webadapter
* Fix "?q=test&engines=wikipedia": fix exception
* Fix "?q=test&engines=wikipedia&categories=images": now the engines from images category are included.
* Fix parse_timeout: make sure a value is always returned
* Various typing fixes (searx.webadapter, searx.search.SearchQuery)
2020-10-06 15:23:19 +02:00
Alexandre Flament 8659212f5a [fix] drop Python 2: use collections.abc.Iterable instead of collections.Iterable 2020-10-06 09:43:24 +02:00
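The import change in 8659212f5a amounts to taking the abstract base class from `collections.abc`. The `collections.Iterable` alias was removed in Python 3.10. `is_iterable` below is a hypothetical helper used only to show the check.

```python
from collections.abc import Iterable


def is_iterable(value):
    """Return True if the value supports iteration."""
    return isinstance(value, Iterable)


# Lists and strings are iterable; a plain integer is not.
checks = [is_iterable([1, 2]), is_iterable("abc"), is_iterable(42)]
```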
Alexandre Flament 15013e64d8 [fix] drop Python 2: use importlib instead of imp.load_source
imp.load_source is not documented in Python 3
see documentation : https://docs.python.org/3/library/importlib.html#importing-a-source-file-directly

partial fix of https://github.com/searx/searx/issues/1674
2020-10-06 09:42:11 +02:00
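The documentation linked in 15013e64d8 describes how to import a source file directly with importlib. A minimal sketch of that recipe follows. `load_source` and the `demo_engine` file are hypothetical names for this example and are not taken from the searx code.

```python
import importlib.util
import os
import tempfile


def load_source(module_name, path):
    """Load a Python source file as a module, without imp.load_source."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Write a tiny throwaway module and load it from its file path.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_engine.py")
    with open(path, "w", encoding="utf-8") as f:
        f.write("categories = ['general']\n")
    demo = load_source("demo_engine", path)
```

The loaded module is not registered in `sys.modules` by this recipe, so a caller that needs that must add it explicitly.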
Alexandre Flament bfdad7bc0f [fix] opensearch.xml URL contains method and autocomplete parameters
When the user adds searx as a search engine, the browser loads the /opensearch.xml URL without the cookies.
Without the query parameters, the user preferences are ignored (method and autocomplete).

In addition, opensearch.xml is modified to support automatic updates,
see https://developer.mozilla.org/en-US/docs/Web/OpenSearch
2020-10-06 00:54:37 +02:00
Alexandre Flament 584760cf54
Merge pull request #2237 from dalf/mod-engines-init
Mod engines init
2020-10-05 11:20:46 +02:00
Alexandre Flament 6c39917c4d [mod] webapp.py: update engines initialization condition
Always call initialize engines except on the first run of werkzeug with the reload feature.

the reload feature is activated when:
* searx_debug is True (SEARX_DEBUG environment variable or settings.yml)
* FLASK_APP=searx/webapp.py FLASK_ENV=development flask run (see https://flask.palletsprojects.com/en/1.1.x/cli/ )

Fix SEARX_DEBUG=0 make docs
docs/admin/engines.rst : engines are initialized
See https://github.com/searx/searx/issues/2204#issuecomment-701373438
2020-10-05 11:13:32 +02:00
Alexandre Flament b728cb610b
Merge pull request #2241 from dalf/move-extract-text-and-url
Move the extract_text and extract_url functions to searx.utils
2020-10-04 09:06:20 +02:00
Alexandre Flament e2cd9b65bb
Merge pull request #2239 from dalf/mod-preferences
[mod] preferences.py: check language setting with a regex instead of match_language
2020-10-04 09:05:14 +02:00
Finn 53c8d945b4
[enh] Add SepiaSearch engine (#2227)
supported_languages values: see https://framagit.org/framasoft/peertube/search-index/-/blob/master/client/src/views/Search.vue#L618-641
2020-10-03 13:00:10 +02:00
Alexandre Flament 8f914a28fa [mod] searx.utils.normalize_url: remove Yahoo hack
* The hack for Yahoo URLs is not necessary anymore. (see searx.engines.yahoo.parse_url)
* move the URL normalization in extract_url to normalize_url
2020-10-03 10:02:50 +02:00
Alexandre Flament c1d10bde02 [mod] searx/utils.py: add docstring 2020-10-02 18:17:01 +02:00
Alexandre Flament 2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Alexandre Flament 507896c115 [mod] preferences.py: check language setting with a regex instead of match_language 2020-10-01 11:29:31 +02:00
Markus Heiser 8162d7aff4 [fix] google engine - div classes have been renamed in HTML result
Since 1 October 2020, Google has changed the 'class' attribute of the HTML
result page.

Fix the xpath expressions and ignore <div class="g" ../> sections which do not
match the title's xpath expression.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-10-01 09:44:29 +02:00
Alexandre Flament 9740618227
Merge pull request #2226 from dalf/fix-searx-migration
[fix] migration from github.com/asciimoo/searx to github.com/searx/searx : fix URLs
2020-09-29 12:35:11 +02:00
Qt Resynth 246b8cd1a4
[fix] about.html: fix small inconsistencies in about page (#2219) 2020-09-28 16:56:25 +02:00
Alexandre Flament f204e4903d [fix] migration from github.com/asciimoo/searx to github.com/searx/searx : fix URLs 2020-09-28 16:44:14 +02:00