sosse


| Field | Value |
| --- | --- |
| Name | sosse |
| Version | 1.9.0 |
| Summary | Selenium Open Source Search Engine |
| upload_time | 2024-03-10 19:38:01 |
| requires_python | >=3.9 |
| license | GNU Affero General Public License v3 |
| keywords | search engine, crawler |
| requirements | No requirements were recorded. |
            <p>
  <img src="https://raw.githubusercontent.com/biolds/sosse/main/se/static/se/logo.svg" width="64" align="right">
  <a href="https://gitlab.com/biolds1/sosse/" alt="Gitlab code coverage" style="text-decoration: none">
    <img src="https://img.shields.io/gitlab/pipeline-coverage/biolds1/sosse?branch=main&style=flat-square">
  </a>
  <a href="https://gitlab.com/biolds1/sosse/-/pipelines" alt="Gitlab pipeline status" style="text-decoration: none">
    <img src="https://img.shields.io/gitlab/pipeline-status/biolds1/sosse?branch=main&style=flat-square">
  </a>
  <a href="https://sosse.readthedocs.io/en/stable/" alt="Documentation" style="text-decoration: none">
    <img src="https://img.shields.io/readthedocs/sosse?style=flat-square">
  </a>
  <a href="https://discord.gg/Vt9cMf7BGK" alt="Discord" style="text-decoration: none">
    <img src="https://img.shields.io/discord/1102142186423844944?style=flat-square&color=%235865f2">
  </a>
  <a href="https://gitlab.com/biolds1/sosse/-/blob/main/LICENSE" alt="License" style="text-decoration: none">
    <img src="https://img.shields.io/gitlab/license/biolds1/sosse?style=flat-square">
  </a>
</p>

SOSSE 🦦
=======

SOSSE (Selenium Open Source Search Engine) is a web archiving tool, crawler and search engine written in Python, distributed under the [GNU-AGPLv3 license](https://www.gnu.org/licenses/agpl-3.0.en.html). It is hosted on both [GitLab](https://gitlab.com/biolds1/sosse) and [GitHub](https://github.com/biolds/sosse); please use either of them to open feature requests, bug reports or merge requests, or [open a discussion](https://github.com/biolds/sosse/discussions).

SOSSE's main features are:
- 🌍 Browser-based crawling: SOSSE uses [Mozilla Firefox](https://www.mozilla.org/firefox/) or [Google Chromium](https://www.chromium.org/Home) together with [Selenium](https://www.selenium.dev/) to index pages that use JavaScript. [Requests](https://docs.python-requests.org/en/latest/index.html) can also be used for faster crawling
- 📚 Offline browsing: SOSSE can save an HTML copy or take screenshots of crawled pages to create archives suitable for offline browsing
- 📉 Low resource requirements: SOSSE is entirely written in Python and uses [PostgreSQL](https://www.postgresql.org/) for data storage
- 🔓 Authentication: the crawlers can submit authentication forms with provided credentials
- 🔗 Search engine shortcuts: shortcut search queries can be used to redirect to external search engines (sometimes called "bang" searches)
- 🔖 Search history: authenticated users can keep a private log of their search queries

See the [documentation](https://sosse.readthedocs.io/en/stable/) and [screenshots](https://sosse.readthedocs.io/en/stable/screenshots.html).
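
To illustrate the "bang" shortcut idea from the feature list, here is a minimal sketch of how a shortcut query could be turned into a redirect URL. It is not SOSSE's implementation: the `SHORTCUTS` table, the `!` prefix default and the `resolve_shortcut` function are hypothetical names chosen for this example.

```python
from typing import Optional
from urllib.parse import quote_plus

# Hypothetical shortcut table: these names and URL templates are
# illustrative assumptions, not SOSSE's built-in configuration.
SHORTCUTS = {
    "w": "https://en.wikipedia.org/w/index.php?search={q}",
    "gh": "https://github.com/search?q={q}",
}


def resolve_shortcut(query: str, prefix: str = "!") -> Optional[str]:
    """Return a redirect URL for a shortcut query, or None to search locally."""
    query = query.strip()
    if not query.startswith(prefix):
        return None
    # Split "!w sea otter" into the shortcut name and the search terms.
    name, _, terms = query[len(prefix):].partition(" ")
    template = SHORTCUTS.get(name)
    if template is None or not terms.strip():
        return None
    return template.format(q=quote_plus(terms.strip()))


if __name__ == "__main__":
    print(resolve_shortcut("!w sea otter"))
    print(resolve_shortcut("sea otter"))
```

A query without the prefix, with an unknown shortcut name, or with no search terms returns `None`, which a caller could treat as "run the normal local search".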

Try it out
==========

You can try the latest version with Docker:

```
docker run -p 8005:80 biolds/sosse:latest
```

Open http://127.0.0.1:8005/, and log in with user ``admin``, password ``admin``.

To persist Docker data, or find alternative installation methods, please check the [documentation](https://sosse.readthedocs.io/en/stable/install.html).
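
Once the container is running, a short script can confirm that the web interface answers on the mapped port. The sketch below uses only the Python standard library and assumes the `-p 8005:80` mapping shown above; `instance_url` and `is_reachable` are hypothetical helper names, not part of SOSSE.

```python
import urllib.error
import urllib.request


def instance_url(host: str = "127.0.0.1", port: int = 8005) -> str:
    """Build the base URL matching the `-p 8005:80` port mapping."""
    return f"http://{host}:{port}/"


def is_reachable(url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    url = instance_url()
    print(url, "is up" if is_reachable(url) else "is not reachable yet")
```

The `opener` parameter exists only so the check can be exercised without a running container; in normal use the default `urllib.request.urlopen` is enough.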

Keep in touch
=============

Join the [Discord server](https://discord.gg/Vt9cMf7BGK) to get help and share ideas!

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "sosse",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "search engine,crawler",
    "author": "",
    "author_email": "Laurent Defert <laurent_defert@yahoo.fr>",
    "download_url": "https://files.pythonhosted.org/packages/82/25/1668948e3eb3510b2a3423170b48abcc438da164bb5b152b9ac88265896d/sosse-1.9.0.tar.gz",
    "platform": null,
    "description": "<p>\n  <img src=\"https://raw.githubusercontent.com/biolds/sosse/main/se/static/se/logo.svg\" width=\"64\" align=\"right\">\n  <a href=\"https://gitlab.com/biolds1/sosse/\" alt=\"Gitlab code coverage\" style=\"text-decoration: none\">\n    <img src=\"https://img.shields.io/gitlab/pipeline-coverage/biolds1/sosse?branch=main&style=flat-square\">\n  </a>\n  <a href=\"https://gitlab.com/biolds1/sosse/-/pipelines\" alt=\"Gitlab pipeline status\" style=\"text-decoration: none\">\n    <img src=\"https://img.shields.io/gitlab/pipeline-status/biolds1/sosse?branch=main&style=flat-square\">\n  </a>\n  <a href=\"https://sosse.readthedocs.io/en/stable/\" alt=\"Documentation\" style=\"text-decoration: none\">\n    <img src=\"https://img.shields.io/readthedocs/sosse?style=flat-square\">\n  </a>\n  <a href=\"https://discord.gg/Vt9cMf7BGK\" alt=\"Discord\" style=\"text-decoration: none\">\n    <img src=\"https://img.shields.io/discord/1102142186423844944?style=flat-square&color=%235865f2\">\n  </a>\n  <a href=\"https://gitlab.com/biolds1/sosse/-/blob/main/LICENSE\" alt=\"License\" style=\"text-decoration: none\">\n    <img src=\"https://img.shields.io/gitlab/license/biolds1/sosse?style=flat-square\">\n  </a>\n</p>\n\nSOSSE \ud83e\udda6\n=======\n\nSOSSE (Selenium Open Source Search Engine) is a Web archiving software, crawler and search engine written in Python, distributed under the [GNU-AGPLv3 license](https://www.gnu.org/licenses/agpl-3.0.en.html). 
It is hosted on both [Gitlab](https://gitlab.com/biolds1/sosse) and [Github](https://github.com/biolds/sosse) site, please use any of them to open feature requests, bug report or merge requests, or [open a discussion](https://github.com/biolds/sosse/discussions).\n\nSOSSE main features are:\n- \ud83c\udf0d Browser based crawling: SOSSE uses [Mozilla Firefox](https://www.mozilla.org/firefox/), or [Google Chromium](https://www.chromium.org/Home) and [Selenium](https://www.selenium.dev/) to index pages that use Javascript. [Requests](https://docs.python-requests.org/en/latest/index.html) can also be used for faster crawling\n- \ud83d\udcda Offline browsing: SOSSE can save HTML copy or take screenshots of crawled pages to create archives suitable for offline browsing\n- \ud83d\udcc9 Low resources requirements: SOSSE is entirely written in Python and uses [PostgreSQL](https://www.postgresql.org/) for data storage\n- \ud83d\udd13 Authentication: the crawlers can submit authentication forms with provided credentials\n- \ud83d\udd17 Search engines shortcuts: shortcuts search queries can be used to redirect to external search engines (sometime called \"bang\" searches)\n- \ud83d\udd16 Search history: users can authenticate to log their search query history privately\n\nSee the [documentation](https://sosse.readthedocs.io/en/stable/) and [screenshots](https://sosse.readthedocs.io/en/stable/screenshots.html).\n\nTry it out\n==========\n\nYou can try the latest version with Docker:\n\n```\ndocker run -p 8005:80 biolds/sosse:latest\n```\n\nOpen http://127.0.0.1:8005/, and log in with user ``admin``, password ``admin``.\n\nTo persist Docker data, or find alternative installation methods, please check the [documentation](https://sosse.readthedocs.io/en/stable/install.html).\n\nKeep in touch\n=============\n\nJoin the [Discord server](https://discord.gg/Vt9cMf7BGK) to get help and share ideas!\n",
    "bugtrack_url": null,
    "license": "GNU Affero General Public License v3",
    "summary": "Selenium Open Source Search Engine",
    "version": "1.9.0",
    "project_urls": null,
    "split_keywords": [
        "search engine",
        "crawler"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "175e2b236b5fbd5360d5eee4b08525726e934c615c016348b26621b79c0574a8",
                "md5": "cc7d56ab65483722518ab9fedf0e804c",
                "sha256": "dc51f007c0bc14dc014a6666a985f09e138a8e25623db8b910750c8e1eab4e98"
            },
            "downloads": -1,
            "filename": "sosse-1.9.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cc7d56ab65483722518ab9fedf0e804c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 212678,
            "upload_time": "2024-03-10T19:37:59",
            "upload_time_iso_8601": "2024-03-10T19:37:59.548934Z",
            "url": "https://files.pythonhosted.org/packages/17/5e/2b236b5fbd5360d5eee4b08525726e934c615c016348b26621b79c0574a8/sosse-1.9.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "82251668948e3eb3510b2a3423170b48abcc438da164bb5b152b9ac88265896d",
                "md5": "61fe8ea6ecf9c80689f76db84bba18ac",
                "sha256": "e1c554a201d2ab1a7df66badc904fa7ca4f9f87b862fa9fba855ccdf5eb75dc5"
            },
            "downloads": -1,
            "filename": "sosse-1.9.0.tar.gz",
            "has_sig": false,
            "md5_digest": "61fe8ea6ecf9c80689f76db84bba18ac",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 232197,
            "upload_time": "2024-03-10T19:38:01",
            "upload_time_iso_8601": "2024-03-10T19:38:01.375462Z",
            "url": "https://files.pythonhosted.org/packages/82/25/1668948e3eb3510b2a3423170b48abcc438da164bb5b152b9ac88265896d/sosse-1.9.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-10 19:38:01",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "sosse"
}
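
The `digests` entries in the record above can be used to check a downloaded release file. The following is a minimal sketch, assuming the sdist has been saved in the current directory under its listed filename; `sha256_of` and `matches_digest` are hypothetical helper names introduced for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_sha256: str) -> bool:
    """Compare the file's SHA-256 with the expected hex digest."""
    return sha256_of(path) == expected_sha256.lower()


if __name__ == "__main__":
    # Assumption: the sdist was downloaded to the current directory.
    sdist = Path("sosse-1.9.0.tar.gz")
    # sha256 value copied from the sdist entry in the record above.
    expected = "e1c554a201d2ab1a7df66badc904fa7ca4f9f87b862fa9fba855ccdf5eb75dc5"
    if sdist.exists():
        print("digest OK" if matches_digest(sdist, expected) else "digest MISMATCH")
    else:
        print("file not found:", sdist)
```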
        