# search_engines
A Python library that queries Google, Bing, Yahoo and other search engines and collects the results from multiple search engine results pages.
Please note that web scraping may violate the terms of service (TOS) of some search engines and may result in a temporary ban.
## Supported search engines
_[Google](https://www.google.com)_
_[Bing](https://www.bing.com)_
_[Yahoo](https://search.yahoo.com)_
_[Duckduckgo](https://duckduckgo.com)_
_[Startpage](https://www.startpage.com)_
_[Aol](https://search.aol.com)_
_[Dogpile](https://www.dogpile.com)_
_[Ask](https://uk.ask.com)_
_[Mojeek](https://www.mojeek.com)_
_[Brave](https://search.brave.com/)_
_[Torch](http://xmh57jrzrnw6insl.onion/4a1f6b371c/search.cgi)_
## Features
- Creates output files (html, csv, json).
- Supports search filters (url, title, text).
- HTTP and SOCKS proxy support.
- Collects dark web links with Torch.
- Easy to add new search engines. You can add a new engine by creating a new class in `search_engines/engines/` and adding it to the `search_engines_dict` dictionary in `search_engines/engines/__init__.py`. The new class should subclass `SearchEngine` and override the following methods: `_selectors`, `_first_page`, `_next_page`.
- Compatible with Python 2 and Python 3.
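
The steps for adding a new engine can be sketched roughly as follows. This is an illustration, not the library's actual code: the `SearchEngine` class below is a stand-in stub so the snippet runs on its own, and `ExampleEngine`, its URL, its CSS selectors, the `"example"` dictionary key, and the method signatures and return values are all assumptions. Check an existing engine in `search_engines/engines/` for the real signatures before copying this shape.

```python
class SearchEngine(object):
    """Stand-in stub for the library's base class (assumption, not the real one)."""

    def __init__(self):
        self._base_url = ""
        self._query = ""


class ExampleEngine(SearchEngine):
    """Hypothetical engine; the URL and selectors are made up for illustration."""

    def __init__(self):
        super(ExampleEngine, self).__init__()
        self._base_url = "https://search.example.com"

    def _selectors(self, element):
        # Map a logical element name to a CSS selector (assumed shape).
        selectors = {
            "url": "a[href]",
            "title": "h2",
            "text": "p",
            "links": "div.result",
            "next": "a.next",
        }
        return selectors[element]

    def _first_page(self):
        # Describe the request for the first results page (assumed shape).
        url = "{}/search?q={}".format(self._base_url, self._query)
        return {"url": url, "data": None}

    def _next_page(self, tags):
        # `tags` is assumed to be a parsed page exposing `select_one`,
        # as a BeautifulSoup document does.
        next_link = tags.select_one(self._selectors("next"))
        url = None
        if next_link is not None and next_link.get("href"):
            url = self._base_url + next_link["href"]
        return {"url": url, "data": None}


# Registration step: in the real package this entry would be added to
# `search_engines_dict` in `search_engines/engines/__init__.py`.
search_engines_dict = {"example": ExampleEngine}
```

Keeping the selectors in one dictionary means that a markup change on the target site usually needs only a one-line fix.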
## Requirements
_Python 2.7 - 3.x_ with
_[Requests](http://docs.python-requests.org/en/master/)_ and
_[BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)_
## Installation
Run the setup file: `$ python setup.py install`.
Done!
## Usage
As a library:
```
from search_engines import Google
engine = Google()
results = engine.search("my query")
links = results.links()
print(links)
```
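
To merge results from several engines, one option is a small helper like the one below. It relies only on the `search()` and `links()` calls shown above. `collect_links` is a hypothetical helper, not part of the library, and it assumes `links()` returns an iterable of URL strings.

```python
def collect_links(engines, query):
    """Query each engine in turn and return unique links in first-seen order."""
    seen = set()
    merged = []
    for engine in engines:
        results = engine.search(query)
        for link in results.links():
            if link not in seen:
                seen.add(link)
                merged.append(link)
    return merged


# Possible usage with the library (assumes `Bing` can be imported
# the same way as `Google`):
#
#     from search_engines import Google, Bing
#     print(collect_links([Google(), Bing()], "my query"))
```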
As a CLI script:
```
$ python search_engines_cli.py -e google,bing -q "my query" -o json,print
```
## Other versions
- [async-search-scraper](https://github.com/soxoj/async-search-scraper): a really cool asynchronous implementation, written by @soxoj.