search-engines-kit

Name: search-engines-kit  
Version: 1.0  
Summary: Search Engines Scraper  
Home page: https://github.com/Juanchobanano/Search-Engines-Scraper  
Upload time: 2024-06-12 07:35:06  
Keywords: search, browsers  
Requirements: requests, beautifulsoup4  
# search_engines  
A Python library that queries Google, Bing, Yahoo and other search engines and collects the results from multiple search engine results pages.  
Please note that web-scraping may be against the TOS of some search engines, and may result in a temporary ban.

## Supported search engines  

_[Google](https://www.google.com)_  
_[Bing](https://www.bing.com)_  
_[Yahoo](https://search.yahoo.com)_  
_[Duckduckgo](https://duckduckgo.com)_  
_[Startpage](https://www.startpage.com)_  
_[Aol](https://search.aol.com)_  
_[Dogpile](https://www.dogpile.com)_  
_[Ask](https://uk.ask.com)_  
_[Mojeek](https://www.mojeek.com)_  
_[Brave](https://search.brave.com/)_  
_[Torch](http://xmh57jrzrnw6insl.onion/4a1f6b371c/search.cgi)_  

## Features  

 - Creates output files (HTML, CSV, JSON).  
 - Supports search filters (URL, title, text).  
 - HTTP and SOCKS proxy support.  
 - Collects dark web links with Torch.  
 - Easy to add new search engines: create a new class in `search_engines/engines/`, add it to the `search_engines_dict` dictionary in `search_engines/engines/__init__.py`, subclass `SearchEngine`, and override the `_selectors`, `_first_page` and `_next_page` methods.  
 - Python 2 and Python 3 compatible.  
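For illustration, here is a minimal sketch of what a new engine class might look like. The base class below is a reduced stand-in (the real `SearchEngine` lives in this package and is not reproduced here), and everything except the `_selectors`, `_first_page` and `_next_page` method names, including the selector keys, the `example.com` URLs and the `_query` attribute, is an assumption:

```python
class SearchEngine:
    """Stand-in for search_engines.engines.SearchEngine, reduced to
    the pieces this sketch needs; not the real implementation."""

    def __init__(self):
        self._query = ""

    def search(self, query):
        self._query = query
        return self._first_page()


class MyEngine(SearchEngine):
    """Hypothetical engine for a search site at example.com."""

    def _selectors(self, element):
        # CSS selectors used to pull links, titles and snippets
        # out of a results page (keys are assumptions).
        selectors = {
            "url": "a.result-link",
            "title": "a.result-link h3",
            "text": "p.result-snippet",
        }
        return selectors[element]

    def _first_page(self):
        # URL (and optional POST data) for the first results page.
        return {"url": f"https://example.com/search?q={self._query}", "data": None}

    def _next_page(self, tags=None):
        # `tags` would be the parsed HTML of the current page; a real
        # engine would extract the "next page" link from it here.
        return {"url": None, "data": None}


engine = MyEngine()
first = engine.search("my query")
print(first["url"])  # https://example.com/search?q=my query
```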

## Requirements  

_Python 2.7 - 3.x_ with  
_[Requests](http://docs.python-requests.org/en/master/)_ and  
_[BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)_  
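According to the package metadata, the minimum pinned versions are requests >= 2.22.0 and beautifulsoup4 >= 4.10.0; as a `requirements.txt` this reads:

```
requests>=2.22.0
beautifulsoup4>=4.10.0
```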

## Installation  

Install from PyPI with `$ pip install search-engines-kit`,  
or run the setup file from a source checkout: `$ python setup.py install`.  

## Usage  

As a library:  

```python
from search_engines import Google

engine = Google()
results = engine.search("my query")
links = results.links()

print(links)
```

As a CLI script:  

```shell
$ python search_engines_cli.py -e google,bing -q "my query" -o json,print
```

## Other versions  

 - [async-search-scraper](https://github.com/soxoj/async-search-scraper) - a really cool asynchronous implementation, written by @soxoj  

            
