searxngRebrandZaclys/searx/engines/peertube.py

# SPDX-License-Identifier: AGPL-3.0-or-later
"""
 peertube (Videos)
"""

from json import loads
from datetime import datetime
from urllib.parse import urlencode
from searx.utils import html_to_text

# about
about = {
    "website": 'https://joinpeertube.org',
    "wikidata_id": 'Q50938515',
    "official_api_documentation": 'https://docs.joinpeertube.org/api-rest-reference.html',
    "use_official_api": True,
    "require_api_key": False,
    "results": 'JSON',
}

# engine dependent config
categories = ["videos"]
paging = True
base_url = "https://peer.tube"
supported_languages_url = (
    'https://framagit.org/framasoft/peertube/search-index/-/raw/master/client/src/views/Search.vue'
)


# do search-request
def request(query, params):
    sanitized_url = base_url.rstrip("/")
    # the PeerTube search API pages by result offset ("start"), 15 results per page
    start = (params["pageno"] - 1) * 15
    search_url = sanitized_url + "/api/v1/search/videos/?start={start}&{query}"
    query_dict = {"search": query}
    language = params["language"].split("-")[0]
    # supported_languages is set on this module by the engine loader
    if "all" != language and language in supported_languages:
        query_dict["languageOneOf"] = language
    params["url"] = search_url.format(query=urlencode(query_dict), start=start)
    return params


def _get_offset_from_pageno(pageno):
    return (pageno - 1) * 15 + 1


# get response from search-request
def response(resp):
    sanitized_url = base_url.rstrip("/")
    results = []

    search_res = loads(resp.text)

    # return empty array if there are no results
    if "data" not in search_res:
        return []

    # parse results
    for res in search_res["data"]:
        title = res["name"]
        url = sanitized_url + "/videos/watch/" + res["uuid"]
        description = res["description"]
        # videos without a description get an empty content string
        content = html_to_text(description) if description else ""
        thumbnail = sanitized_url + res["thumbnailPath"]
        publishedDate = datetime.strptime(res["publishedAt"], "%Y-%m-%dT%H:%M:%S.%fZ")

        results.append(
            {
                "template": "videos.html",
                "url": url,
                "title": title,
                "content": content,
                "publishedDate": publishedDate,
                "data_src": sanitized_url + res["embedPath"],
                "thumbnail": thumbnail,
            }
        )

    # return results
    return results


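def _fetch_supported_languages(resp):
    import re

    # the language ids are scraped from the videoLanguages () block of Search.vue
    # https://docs.python.org/3/howto/regex.html#greedy-versus-non-greedy
    videolanguages = re.search(r"videoLanguages \(\)[^\n]+(.*?)\]", resp.text, re.DOTALL)
    # if the upstream layout changed, return no languages instead of raising AttributeError
    if videolanguages is None:
        return []
    peertube_languages = [m.group(1) for m in re.finditer(r"\{ id: '([a-z]+)', label:", videolanguages.group(1))]
    return peertube_languages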
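
The two regular expressions in `_fetch_supported_languages` can be tried in isolation. The sketch below is only an illustration: `SAMPLE_VUE` is a made-up excerpt standing in for `Search.vue` (the real file may be laid out differently), and `extract_language_ids` is a hypothetical helper name that is not part of the engine.

```python
import re

# Hypothetical excerpt; an assumption about the shape of Search.vue, not a copy of it.
SAMPLE_VUE = """
    videoLanguages () {
      return [
        { id: 'en', label: this.$gettext('English') },
        { id: 'fr', label: this.$gettext('French') },
        { id: 'de', label: this.$gettext('German') }
      ]
    },
"""


def extract_language_ids(text):
    """Return the language ids found in the videoLanguages () block of text."""
    # Non-greedy (.*?) stops at the first "]", i.e. the end of the returned array.
    block = re.search(r"videoLanguages \(\)[^\n]+(.*?)\]", text, re.DOTALL)
    if block is None:
        # No such block: report no languages rather than raising.
        return []
    return [m.group(1) for m in re.finditer(r"\{ id: '([a-z]+)', label:", block.group(1))]


print(extract_language_ids(SAMPLE_VUE))
```

With the sample above, the first pattern captures everything between the `videoLanguages ()` line and the first closing bracket, and the second pattern picks the `id` values out of that capture.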

Blame summary (commit date, commit message, lines attributed to the commit):

2020-08-06 15:42:46 +00:00  Drop Python 2 (1/n): remove unicode string and url_utils
    `from urllib.parse import urlencode`

2020-08-08 17:22:53 +00:00  add peertube engine (#2109)
    all lines not listed under another commit (module docstring, remaining imports, `categories`, `paging`, the bodies of `request`, `_get_offset_from_pageno` and `response`, the `def _fetch_supported_languages(resp):` line and its `return`)

2021-01-13 10:31:25 +00:00  [enh] engines: add about variable. Move meta information from comment to the about variable so the preferences and the documentation can show this information.
    `# SPDX-License-Identifier: AGPL-3.0-or-later` and the `about = {...}` block

2021-02-13 18:47:33 +00:00  Improve peertube searching. At the moment videos without a description are not shown; setting the default content to "" fixes this. Another current bug is that thumbnails are not displayed. This is caused by a double slash in the URL. For this, every trailing slash is now stripped (for backwards compatibility) and the API response is correctly parsed.
    `base_url`, both `sanitized_url = base_url.rstrip("/")` lines, `search_url`, `url`, `content = ""`, `thumbnail`

2021-06-04 09:09:36 +00:00  [fix] peertube fetch supported languages, close #127
    `supported_languages_url`, `import re`, the regex howto comment, `peertube_languages`

2021-07-23 10:03:16 +00:00  [fix] peertube: update _fetch_supported_languages. Update the regex to match the changes in the peertube source code; fixes "make data.languages".
    `videolanguages = re.search(...)`

2021-12-27 08:26:22 +00:00  [format.python] initial formatting of the python code. This patch was generated by black (https://github.com/psf/black) via `make format.python`. Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
    `params["url"] = search_url.format(...)`

2022-02-07 15:16:57 +00:00  [mod] result_templates/videos.html: replace embedded HTML by data_src. Embedded HTML breaks SearXNG architecture. To modularize, HTML is generated in the templates (oscar & simple) and the result parameter 'embedded' is replaced by 'data_src', a URL for embedded content (<iframe>). Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
    `"data_src": sanitized_url + res["embedPath"]`
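
The double-slash problem described in the 2021-02-13 commit message can be reproduced with plain string concatenation. This is a minimal sketch, assuming the API returns paths that begin with a slash; the thumbnail path and the helper name `join_instance_path` are hypothetical and only mirror what `base_url.rstrip("/")` does in the engine.

```python
def join_instance_path(instance_url, path):
    """Join an instance URL and an absolute path without doubling the slash."""
    # Strip trailing slashes so "https://host/" and "https://host" behave the same.
    return instance_url.rstrip("/") + path


# Hypothetical thumbnail path, chosen only for illustration.
THUMBNAIL_PATH = "/static/thumbnails/example.jpg"

# A configured base URL with a trailing slash breaks naive concatenation.
naive = "https://peer.tube/" + THUMBNAIL_PATH
fixed = join_instance_path("https://peer.tube/", THUMBNAIL_PATH)

print(naive)
print(fixed)
```

Stripping on every use, rather than requiring a particular form of `base_url` in the settings, keeps existing configurations with a trailing slash working, which matches the "for backwards compatibility" remark in the commit message.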