| Name | MultiScraper |
| --- | --- |
| Version | 1.5.0 |
| Summary | Multi Scraper is a set of tools for fast and easy parsing |
| upload_time | 2023-06-22 13:26:46 |
| maintainer | |
| docs_url | None |
| author | Alexander554 |
| requires_python | >=3.7 |
| license | |
| keywords | multiscraper |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# MultiScraper
MultiScraper is a set of tools for fast and easy parsing. It provides methods for obtaining HTML and soup objects via modules such as requests, cloudscraper, Selenium, http.client, aiohttp, and urllib3. It also ships two helpers: Regular, a collection of ready-made regular expressions useful for parsing, and ResponseHandler, which helps you interpret HTTP status codes. The centerpiece is LxmlSoup, an analogue of BeautifulSoup that implements the most essential methods with the same syntax and runs about twice as fast as bs4:
```
0.7749056816101074 - LxmlSoup
1.4368107318878174 - BeautifulSoup
```
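The Regular and ResponseHandler helpers are not documented in detail here. A minimal standard-library sketch of what such helpers might look like (the regex, function names, and behavior below are illustrative assumptions, not the package's actual API):

```python
import re
from http import HTTPStatus

# Hypothetical "Regular"-style helper: a ready-made regex for http(s) URLs.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def extract_urls(text: str) -> list[str]:
    """Return all http(s) URLs found in a string."""
    return URL_RE.findall(text)

# Hypothetical "ResponseHandler"-style helper: explain a status code.
def describe_status(code: int) -> str:
    """Map an HTTP status code to a short human-readable phrase."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        return f"{code} Unknown"
    return f"{code} {status.phrase}"

print(describe_status(404))  # 404 Not Found
```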
## Installation
MultiScraper requires Python >= 3.7
Install with `pip` from PyPI:
```
pip install MultiScraper
```
### Examples
```python
from MultiScraper import Requests, LxmlSoup

html = Requests.get_html('https://sunlight.net/catalog')
soup = LxmlSoup.LxmlSoup(html)

links = soup.find_all('a', class_='cl-item-link js-cl-item-link js-cl-item-root-link')
for link in links:
    print(link.text(), link.get('href'))
```
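The `find_all` / `text()` / `get()` interface above mirrors BeautifulSoup's. As a rough illustration of the same link-extraction pattern using only the standard library (this says nothing about LxmlSoup's internals; the class name and sample HTML are made up):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for <a> tags with a given class attribute."""
    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.links = []
        self._capturing = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == self.wanted_class:
            self._capturing = True
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._capturing:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._capturing:
            self.links.append(("".join(self._text).strip(), self._href))
            self._capturing = False

parser = LinkCollector("item-link")
parser.feed('<a class="item-link" href="/p/1">Ring</a><a href="/x">skip</a>')
print(parser.links)  # [('Ring', '/p/1')]
```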
```python
from MultiScraper import Aiohttp, LxmlSoup
import asyncio

async def main():
    url = "https://example.com"
    html = await Aiohttp.get_html(url)
    soup = LxmlSoup.LxmlSoup(html)
    title = soup.find('h1').text()
    print(title)

asyncio.run(main())
```
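The async example fetches a single page; the usual reason to reach for an async client is fetching many pages concurrently. A minimal sketch of that pattern with `asyncio.gather` (the `fetch_html` stub below stands in for a real call such as `Aiohttp.get_html`):

```python
import asyncio

async def fetch_html(url: str) -> str:
    # Stub standing in for a real async HTTP call (e.g. Aiohttp.get_html).
    await asyncio.sleep(0)  # yield control, as a network request would
    return f"<html><h1>{url}</h1></html>"

async def fetch_all(urls):
    # Schedule every fetch at once and await all results together.
    return await asyncio.gather(*(fetch_html(u) for u in urls))

pages = asyncio.run(fetch_all(["https://example.com", "https://example.org"]))
print(len(pages))  # 2
```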
## Raw data

```json
{
  "_id": null,
  "home_page": "",
  "name": "MultiScraper",
  "maintainer": "",
  "docs_url": null,
  "requires_python": ">=3.7",
  "maintainer_email": "",
  "keywords": "MultiScraper",
  "author": "Alexander554",
  "author_email": "gaa.280811@gamil.com",
  "download_url": "https://files.pythonhosted.org/packages/d2/11/3b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36/MultiScraper-1.5.0.tar.gz",
  "platform": null,
  "description": "# MultiScraper\r\nMultiScraper is a set of tools for fast and easy parsing. It includes methods for extracting html and soup objects from modules such as: requests, cloudscraper, Selenium, http.client, Aiohttp, urllib3. It also has 2 assistants. The first one is Regular, it contains ready-made, useful for parsing regular expressions. The second one is ResponseHandler, will help you figure out what the status code means to you. The most important thing is Lxmlsoup. This is an analogue of BeautifulSoup containing the most basic and necessary methods. Its speed exceeds bs4 by 2 times. The syntax is the same.\r\n\r\n```\r\n0.7749056816101074 - LxmlSoup\r\n1.4368107318878174 - BeautifulSoup\r\n```\r\n## Installation\r\n\r\nMultiScraper requires Python >= 3.7\r\n\r\nInstall with `pip` from PyPI:\r\n\r\n```\r\npip install MultiScraper\r\n```\r\n\r\n### Examples\r\n\r\n```python\r\nfrom MultiScraper import Requests, LxmlSoup\r\n\r\nhtml = Requests.get_html('https://sunlight.net/catalog')\r\nsoup = LxmlSoup.LxmlSoup(html)\r\n\r\nlinks = soup.find_all('a', class_='cl-item-link js-cl-item-link js-cl-item-root-link')\r\nfor link in links:\r\n print(link.text(), link.get('href'))\r\n```\r\n\r\n```python\r\nfrom MultiScraper import Aiohttp, LxmlSoup\r\nimport asyncio\r\n\r\nasync def main():\r\n url = \"https://example.com\" \r\n html = await Aiohttp.get_html(url)\r\n soup = LxmlSoup.LxmlSoup(html) \r\n title = soup.find('h1').text()\r\n print(title)\r\n\r\nloop = asyncio.get_event_loop()\r\nloop.run_until_complete(main())\r\n```\r\n",
  "bugtrack_url": null,
  "license": "",
  "summary": "Multi Scraper is a set of tools for fast and easy parsing",
  "version": "1.5.0",
  "project_urls": null,
  "split_keywords": [
    "multiscraper"
  ],
  "urls": [
    {
      "comment_text": "",
      "digests": {
        "blake2b_256": "c675c05483233212f7c96ff88acb2f1d08033b4f9b6768b508d7dfd9185a46f0",
        "md5": "74f028c5b3ec45ab2f2eae12153068e0",
        "sha256": "6b9f787ec4d65fbe23485eab48ea341162788e1584dbff36265757cc89f85559"
      },
      "downloads": -1,
      "filename": "MultiScraper-1.5.0-py3-none-any.whl",
      "has_sig": false,
      "md5_digest": "74f028c5b3ec45ab2f2eae12153068e0",
      "packagetype": "bdist_wheel",
      "python_version": "py3",
      "requires_python": ">=3.7",
      "size": 8465,
      "upload_time": "2023-06-22T13:26:44",
      "upload_time_iso_8601": "2023-06-22T13:26:44.333786Z",
      "url": "https://files.pythonhosted.org/packages/c6/75/c05483233212f7c96ff88acb2f1d08033b4f9b6768b508d7dfd9185a46f0/MultiScraper-1.5.0-py3-none-any.whl",
      "yanked": false,
      "yanked_reason": null
    },
    {
      "comment_text": "",
      "digests": {
        "blake2b_256": "d2113b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36",
        "md5": "0be99b949b18867b2108b8607ae178ba",
        "sha256": "235eeaf25267ede2e2595e69e9dcc01699d233490a706b5393b7b149dc16a3be"
      },
      "downloads": -1,
      "filename": "MultiScraper-1.5.0.tar.gz",
      "has_sig": false,
      "md5_digest": "0be99b949b18867b2108b8607ae178ba",
      "packagetype": "sdist",
      "python_version": "source",
      "requires_python": ">=3.7",
      "size": 5819,
      "upload_time": "2023-06-22T13:26:46",
      "upload_time_iso_8601": "2023-06-22T13:26:46.488519Z",
      "url": "https://files.pythonhosted.org/packages/d2/11/3b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36/MultiScraper-1.5.0.tar.gz",
      "yanked": false,
      "yanked_reason": null
    }
  ],
  "upload_time": "2023-06-22 13:26:46",
  "github": false,
  "gitlab": false,
  "bitbucket": false,
  "codeberg": false,
  "lcname": "multiscraper"
}
```