MultiScraper


Name: MultiScraper
Version: 1.5.0
Summary: Multi Scraper is a set of tools for fast and easy parsing
upload_time: 2023-06-22 13:26:46
author: Alexander554
requires_python: >=3.7
keywords: multiscraper
requirements: No requirements were recorded.
# MultiScraper
MultiScraper is a set of tools for fast and easy parsing. It provides methods for getting HTML and soup objects through the following modules: requests, cloudscraper, Selenium, http.client, Aiohttp, and urllib3. It also includes two helpers. Regular contains ready-made regular expressions that are useful for parsing, and ResponseHandler helps you work out what a given status code means. The most important component is LxmlSoup, an analogue of BeautifulSoup that implements only the most basic and necessary methods. Its syntax is the same as bs4, and in the timings below it is roughly twice as fast:

```
0.7749056816101074 - LxmlSoup
1.4368107318878174 - BeautifulSoup
```
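
The timings above do not say what was measured or how. As a rough sketch of how such a comparison could be reproduced, the helper below times repeated calls to any parse function with `time.perf_counter`. The names `time_parser` and `stdlib_parse` and the sample markup are assumptions made for illustration, and the standard-library `html.parser` stands in so that the sketch runs without third-party packages. It also computes the ratio of the two published figures, which comes to about 1.85.

```python
import time
from html.parser import HTMLParser


def time_parser(parse, html, repeats=100):
    """Return the total seconds spent calling parse(html) `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        parse(html)
    return time.perf_counter() - start


def stdlib_parse(html):
    # Stand-in parser (assumption): lets the sketch run with the stdlib only.
    parser = HTMLParser()
    parser.feed(html)
    parser.close()


# Hypothetical sample document: 200 identical links.
sample = "<html><body>" + "<a class='x' href='/p'>item</a>" * 200 + "</body></html>"
elapsed = time_parser(stdlib_parse, sample, repeats=20)
print(f"{elapsed} - html.parser")

# Ratio of the two figures quoted above (BeautifulSoup / LxmlSoup).
ratio = 1.4368107318878174 / 0.7749056816101074
print(f"speed-up: {ratio:.2f}x")
```

To compare the real parsers, one could pass a callable such as `lambda h: LxmlSoup.LxmlSoup(h)` in place of `stdlib_parse`, plus an equivalent callable that builds a `BeautifulSoup` object, and run both on the same input.
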
## Installation

MultiScraper requires Python 3.7 or later.

Install with `pip` from PyPI:

```
pip install MultiScraper
```
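
To confirm that the install succeeded, one option is to ask the interpreter which version it sees. This is a small sketch using the standard-library `importlib.metadata` module, which needs Python 3.8 or later; the helper name `installed_version` is an assumption for illustration and is not part of MultiScraper.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if the package is installed, otherwise None.
print(installed_version("MultiScraper"))
```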

## Examples

```python
from MultiScraper import Requests, LxmlSoup

# Fetch the page HTML and build a soup object from it
html = Requests.get_html('https://sunlight.net/catalog')
soup = LxmlSoup.LxmlSoup(html)

# Find the catalog links by their class attribute, then print text and URL
links = soup.find_all('a', class_='cl-item-link js-cl-item-link js-cl-item-root-link')
for link in links:
    print(link.text(), link.get('href'))
```
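
The two helpers named in the introduction, Regular and ResponseHandler, have no example here, and their actual interfaces are not documented on this page. The sketch below is therefore not the MultiScraper API. It is a standard-library stand-in that illustrates the same two ideas: a ready-made regular expression for parsing, and a function that explains a status code. `EMAIL_RE` and `describe_status` are hypothetical names.

```python
import re
from http import HTTPStatus

# Hypothetical ready-made pattern (assumption, not taken from MultiScraper).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def describe_status(code):
    """Return a short human-readable note for an HTTP status code."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        return f"{code}: unknown status code"
    if status.value >= 500:
        hint = "server error, retrying later may help"
    elif status.value >= 400:
        hint = "client error, check the URL, headers, or access rights"
    elif status.value >= 300:
        hint = "redirect, follow the Location header"
    elif status.value >= 200:
        hint = "success, the body can be parsed"
    else:
        hint = "informational, no final response yet"
    return f"{status.value} {status.phrase}: {hint}"


print(describe_status(404))
print(EMAIL_RE.findall("write to a@b.org or c.d@e.co.uk"))
```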

```python
from MultiScraper import Aiohttp, LxmlSoup
import asyncio

async def main():
    url = "https://example.com"
    html = await Aiohttp.get_html(url)  # get_html is awaited, so it runs inside a coroutine
    soup = LxmlSoup.LxmlSoup(html)
    title = soup.find('h1').text()
    print(title)

# asyncio.run() creates and closes the event loop (available since Python 3.7)
asyncio.run(main())
```
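
An awaitable fetcher pays off mainly when several pages are requested at once. The sketch below shows one way that could look, using `asyncio.gather` with a semaphore to cap the number of requests in flight. `fetch_all` and `fake_get_html` are hypothetical names. The fake fetcher returns canned markup so that the example runs without network access; in real use one would presumably pass `Aiohttp.get_html` as the `fetch` argument.

```python
import asyncio


async def fetch_all(fetch, urls, limit=5):
    """Fetch all urls concurrently, with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url):
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_one(url) for url in urls))


async def fake_get_html(url):
    # Stand-in for a real awaitable fetcher (assumption): no network involved.
    await asyncio.sleep(0)
    return f"<h1>{url}</h1>"


urls = ["https://example.com/a", "https://example.com/b"]
pages = asyncio.run(fetch_all(fake_get_html, urls, limit=2))
print(pages)
```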

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "MultiScraper",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "MultiScraper",
    "author": "Alexander554",
    "author_email": "gaa.280811@gamil.com",
    "download_url": "https://files.pythonhosted.org/packages/d2/11/3b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36/MultiScraper-1.5.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Multi Scraper is a set of tools for fast and easy parsing",
    "version": "1.5.0",
    "project_urls": null,
    "split_keywords": [
        "multiscraper"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c675c05483233212f7c96ff88acb2f1d08033b4f9b6768b508d7dfd9185a46f0",
                "md5": "74f028c5b3ec45ab2f2eae12153068e0",
                "sha256": "6b9f787ec4d65fbe23485eab48ea341162788e1584dbff36265757cc89f85559"
            },
            "downloads": -1,
            "filename": "MultiScraper-1.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "74f028c5b3ec45ab2f2eae12153068e0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 8465,
            "upload_time": "2023-06-22T13:26:44",
            "upload_time_iso_8601": "2023-06-22T13:26:44.333786Z",
            "url": "https://files.pythonhosted.org/packages/c6/75/c05483233212f7c96ff88acb2f1d08033b4f9b6768b508d7dfd9185a46f0/MultiScraper-1.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d2113b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36",
                "md5": "0be99b949b18867b2108b8607ae178ba",
                "sha256": "235eeaf25267ede2e2595e69e9dcc01699d233490a706b5393b7b149dc16a3be"
            },
            "downloads": -1,
            "filename": "MultiScraper-1.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0be99b949b18867b2108b8607ae178ba",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 5819,
            "upload_time": "2023-06-22T13:26:46",
            "upload_time_iso_8601": "2023-06-22T13:26:46.488519Z",
            "url": "https://files.pythonhosted.org/packages/d2/11/3b42f9cb22b8fa8c34fe2a85f9ef6848fee014edee02fb6949e96987ec36/MultiScraper-1.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-22 13:26:46",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "multiscraper"
}
        