<p>
<img src="https://raw.githubusercontent.com/biolds/sosse/main/se/static/se/logo.svg" width="64" align="right">
<a href="https://gitlab.com/biolds1/sosse/" alt="Gitlab code coverage" style="text-decoration: none">
<img src="https://img.shields.io/gitlab/pipeline-coverage/biolds1/sosse?branch=main&style=flat-square">
</a>
<a href="https://gitlab.com/biolds1/sosse/-/pipelines" alt="Gitlab pipeline status" style="text-decoration: none">
<img src="https://img.shields.io/gitlab/pipeline-status/biolds1/sosse?branch=main&style=flat-square">
</a>
<a href="https://sosse.readthedocs.io/en/stable/" alt="Documentation" style="text-decoration: none">
<img src="https://img.shields.io/readthedocs/sosse?style=flat-square">
</a>
<a href="https://discord.gg/Vt9cMf7BGK" alt="Discord" style="text-decoration: none">
<img src="https://img.shields.io/discord/1102142186423844944?style=flat-square&color=%235865f2">
</a>
<a href="https://gitlab.com/biolds1/sosse/-/blob/main/LICENSE" alt="License" style="text-decoration: none">
<img src="https://img.shields.io/gitlab/license/biolds1/sosse?style=flat-square">
</a>
</p>
SOSSE 🦦
=======
SOSSE (Selenium Open Source Search Engine) is web archiving software, a crawler and a search engine written in Python, distributed under the [GNU-AGPLv3 license](https://www.gnu.org/licenses/agpl-3.0.en.html). It is hosted on both [GitLab](https://gitlab.com/biolds1/sosse) and [GitHub](https://github.com/biolds/sosse); please use either of them to open feature requests, bug reports or merge requests, or [open a discussion](https://github.com/biolds/sosse/discussions).
SOSSE's main features are:
- 🌍 Browser-based crawling: SOSSE uses [Mozilla Firefox](https://www.mozilla.org/firefox/) or [Google Chromium](https://www.chromium.org/Home) with [Selenium](https://www.selenium.dev/) to index pages that use JavaScript. [Requests](https://docs.python-requests.org/en/latest/index.html) can also be used for faster crawling
- 📚 Offline browsing: SOSSE can save an HTML copy or take screenshots of crawled pages to create archives suitable for offline browsing
- 📉 Low resource requirements: SOSSE is entirely written in Python and uses [PostgreSQL](https://www.postgresql.org/) for data storage
- 🔓 Authentication: the crawlers can submit authentication forms with provided credentials
- 🔗 Search engine shortcuts: shortcut search queries can be used to redirect to external search engines (sometimes called "bang" searches)
- 🔖 Search history: users can authenticate to log their search query history privately
See the [documentation](https://sosse.readthedocs.io/en/stable/) and [screenshots](https://sosse.readthedocs.io/en/stable/screenshots.html).
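
To illustrate the "bang" search idea from the feature list, here is a minimal Python sketch of how a shortcut query could be turned into a redirect URL. It is not SOSSE's actual implementation: the `SHORTCUTS` table, the `!` prefix and the `resolve_shortcut` function are all hypothetical names chosen for this example.

```python
from urllib.parse import quote_plus

# Hypothetical shortcut table (an assumption for this sketch, not SOSSE's
# configuration): shortcut name -> URL template for an external engine.
SHORTCUTS = {
    "w": "https://en.wikipedia.org/wiki/Special:Search?search={q}",
    "ddg": "https://duckduckgo.com/?q={q}",
}


def resolve_shortcut(query, prefix="!"):
    """Return a redirect URL if the query contains a known shortcut, else None."""
    words = query.split()
    for index, word in enumerate(words):
        name = word[len(prefix):]
        if word.startswith(prefix) and name in SHORTCUTS:
            # Everything except the shortcut word becomes the search terms.
            terms = " ".join(words[:index] + words[index + 1:])
            return SHORTCUTS[name].format(q=quote_plus(terms))
    # No shortcut found: the caller would run a regular local search.
    return None
```

Under these assumptions, `resolve_shortcut("!w sea otter")` yields a Wikipedia search URL, while a query without a shortcut returns `None`.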
Try it out
==========
You can try the latest version with Docker:
```
docker run -p 8005:80 biolds/sosse:latest
```
Open http://127.0.0.1:8005/, and log in with user ``admin``, password ``admin``.
To persist Docker data, or find alternative installation methods, please check the [documentation](https://sosse.readthedocs.io/en/stable/install.html).
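
The container may take a few moments to start answering requests. As a rough sketch, the Python helper below polls the address used above until it responds; the function name `wait_for_sosse`, its default timings and the `opener` parameter are assumptions made for this example, not part of SOSSE.

```python
import time
import urllib.error
import urllib.request


def wait_for_sosse(url="http://127.0.0.1:8005/", timeout=60.0, interval=2.0,
                   opener=urllib.request.urlopen):
    """Poll `url` until it answers with a status below 500, or until `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener(url, timeout=5) as response:
                if response.status < 500:
                    return True
        except urllib.error.HTTPError as exc:
            # A 4xx answer still means the web server is up.
            if exc.code < 500:
                return True
        except OSError:
            # Connection refused or timed out: the server is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_sosse()` after `docker run` returns `True` once the page can be opened in a browser, or `False` if nothing answered within the timeout.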
Keep in touch
=============
Join the [Discord server](https://discord.gg/Vt9cMf7BGK) to get help and share ideas!
Raw data
{
"_id": null,
"home_page": null,
"name": "sosse",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "search engine, crawler",
"author": null,
"author_email": "Laurent Defert <laurent_defert@yahoo.fr>",
"download_url": "https://files.pythonhosted.org/packages/2a/13/b213e5e2917bd563c61fd542e9b20403e7cc544279010c437e62e9bfcbec/sosse-1.11.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GNU Affero General Public License v3",
"summary": "Selenium Open Source Search Engine",
"version": "1.11.1",
"project_urls": null,
"split_keywords": [
"search engine",
" crawler"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f5c3b0053f1818f1715e65dfd44cfed939fda85488ea57f984975bd0f50aad75",
"md5": "08528972603c8c9c6271c4b8a3a3df66",
"sha256": "220ad6b65d1fc7addc0bf6d6b384f5d4098671db4f1236e970283fe77c904703"
},
"downloads": -1,
"filename": "sosse-1.11.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "08528972603c8c9c6271c4b8a3a3df66",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 3422690,
"upload_time": "2024-12-26T20:12:10",
"upload_time_iso_8601": "2024-12-26T20:12:10.128676Z",
"url": "https://files.pythonhosted.org/packages/f5/c3/b0053f1818f1715e65dfd44cfed939fda85488ea57f984975bd0f50aad75/sosse-1.11.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2a13b213e5e2917bd563c61fd542e9b20403e7cc544279010c437e62e9bfcbec",
"md5": "712cb4c6dbd96e3eb789edf07bae888c",
"sha256": "bd1f65cdc6ba4a99b314c7ebf4d3e119b57aab769216fec9763a163ad1cf90a4"
},
"downloads": -1,
"filename": "sosse-1.11.1.tar.gz",
"has_sig": false,
"md5_digest": "712cb4c6dbd96e3eb789edf07bae888c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 3399012,
"upload_time": "2024-12-26T20:12:14",
"upload_time_iso_8601": "2024-12-26T20:12:14.637660Z",
"url": "https://files.pythonhosted.org/packages/2a/13/b213e5e2917bd563c61fd542e9b20403e7cc544279010c437e62e9bfcbec/sosse-1.11.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-26 20:12:14",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "sosse"
}
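
The `sha256` values listed under `digests` can be used to check a downloaded file. The following is a minimal sketch using Python's standard `hashlib`; the helper name `sha256_matches` and the chunk size are assumptions made for this example.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the SHA-256 digest of the file at `path` equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For instance, `sha256_matches("sosse-1.11.1.tar.gz", "bd1f65cdc6ba4a99b314c7ebf4d3e119b57aab769216fec9763a163ad1cf90a4")` should return `True` for an intact copy of the source archive listed above, assuming the file sits in the current directory.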