scrapemaster


| Field | Value |
| --- | --- |
| Name | scrapemaster |
| Version | 0.2.2 |
| home_page | https://github.com/ParisNeo/ScrapeMaster |
| Summary | A versatile web scraping library with multiple techniques |
| upload_time | 2024-10-07 21:22:48 |
| maintainer | None |
| docs_url | None |
| author | ParisNeo |
| requires_python | >=3.7 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# ScrapeMaster

ScrapeMaster is a comprehensive Python library for web scraping that handles both simple and complex websites, offering features like text and image extraction, session management, and anti-bot circumvention techniques.

## Features

- Scrape text and images from websites
- Handle JavaScript-rendered content using Selenium
- Manage cookies and sessions for authenticated scraping
- Rotate user agents and use proxies to avoid detection
- Clean and format extracted data
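
To make the user-agent and proxy items above concrete, here is a minimal sketch of the general technique using only the Python standard library. It is not ScrapeMaster's implementation or API: the names `USER_AGENTS`, `build_requests`, and `make_opener` are hypothetical, and the User-Agent strings are placeholders.

```python
import itertools
import urllib.request

# Placeholder User-Agent strings (assumption); substitute your own pool.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0) ExampleBrowser/3.0",
]


def build_requests(urls, user_agents=USER_AGENTS):
    """Pair each URL with the next User-Agent in round-robin order."""
    pool = itertools.cycle(user_agents)
    return [
        urllib.request.Request(url, headers={"User-Agent": next(pool)})
        for url in urls
    ]


def make_opener(proxy_url=None):
    """Return an opener that routes HTTP(S) traffic through proxy_url if given."""
    handlers = []
    if proxy_url:
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    return urllib.request.build_opener(*handlers)
```

Each `Request` built this way can be passed to `make_opener(...).open(request)`, so consecutive fetches carry different User-Agent headers.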

## Installation

You can install ScrapeMaster using pip:

```
pip install ScrapeMaster
```

## Quick Start

Here's a simple example of how to use ScrapeMaster:

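```python
from scrapemaster import ScrapeMaster

# Create a scraper for the target page
scraper = ScrapeMaster('https://example.com')

# 'p' and 'img' select the text and image elements; 'output_images'
# is presumably the folder where downloaded images are saved
results = scraper.scrape_all('p', 'img', 'output_images')

# The result is a dict with 'texts' and 'image_urls' keys
print(results['texts'])
print(results['image_urls'])
```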
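
For readers who want to see what text and image extraction involves, the following sketch does the same kind of job on an HTML string with the standard library's `html.parser`. It is an illustration only, not how ScrapeMaster works internally; `TextAndImageParser` and `extract` are hypothetical names, and the result dict simply mirrors the `'texts'` and `'image_urls'` keys used in the Quick Start.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TextAndImageParser(HTMLParser):
    """Collect the text of <p> elements and the src of <img> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.texts = []
        self.image_urls = []
        self._in_p = False   # True while inside a <p> element
        self._chunks = []    # text fragments of the current <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative image paths against the page URL
                self.image_urls.append(urljoin(self.base_url, src))

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Clean the text: collapse runs of whitespace, drop empty paragraphs
            text = " ".join("".join(self._chunks).split())
            if text:
                self.texts.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._chunks.append(data)


def extract(html, base_url):
    """Return {'texts': [...], 'image_urls': [...]} for an HTML string."""
    parser = TextAndImageParser(base_url)
    parser.feed(html)
    parser.close()
    return {"texts": parser.texts, "image_urls": parser.image_urls}
```

Because `extract` works on a string, it can be tried on saved HTML without any network access.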

## Advanced Usage

For more advanced usage, including handling JavaScript-rendered content and authenticated scraping, see the project repository at https://github.com/ParisNeo/ScrapeMaster.
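
As background on authenticated scraping, the sketch below shows the usual cookie-based session pattern with the standard library. It does not use ScrapeMaster and makes no claim about its API: `make_session` and `login` are hypothetical helpers, and the login URL and form field names would depend entirely on the target site.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session(user_agent="example-scraper/0.1"):
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def login(opener, login_url, form_fields):
    """POST form_fields to login_url; cookies from the response land in the jar."""
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    with opener.open(login_url, data=body, timeout=10) as response:
        return response.status
```

After a successful `login`, later `opener.open(...)` calls send the stored session cookies automatically, which is what lets pages behind a login be fetched.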

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the LICENSE file for details.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/ParisNeo/ScrapeMaster",
    "name": "scrapemaster",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": null,
    "author": "ParisNeo",
    "author_email": "parisneoai@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/78/1a/112713953492665a15cfe63d6803bf23f3d0d7a3cf6c342569e02e9b834f/scrapemaster-0.2.2.tar.gz",
    "platform": null,
    "description": "# ScrapeMaster\r\n\r\nScrapeMaster is a comprehensive Python library for web scraping that handles both simple and complex websites, offering features like text and image extraction, session management, and anti-bot circumvention techniques.\r\n\r\n## Features\r\n\r\n- Scrape text and images from websites\r\n- Handle JavaScript-rendered content using Selenium\r\n- Manage cookies and sessions for authenticated scraping\r\n- Rotate user agents and use proxies to avoid detection\r\n- Clean and format extracted data\r\n\r\n## Installation\r\n\r\nYou can install ScrapeMaster using pip:\r\n\r\n```\r\npip install ScrapeMaster\r\n```\r\n\r\n## Quick Start\r\n\r\nHere's a simple example of how to use ScrapeMaster:\r\n\r\n```python\r\nfrom scrapemaster import ScrapeMaster\r\n\r\nscraper = ScrapeMaster('https://example.com')\r\nresults = scraper.scrape_all('p', 'img', 'output_images')\r\nprint(results['texts'])\r\nprint(results['image_urls'])\r\n```\r\n\r\n## Advanced Usage\r\n\r\nFor more advanced usage, including handling of JavaScript-rendered content and authenticated scraping, please refer to the documentation.\r\n\r\n## Contributing\r\n\r\nContributions are welcome! Please feel free to submit a Pull Request.\r\n\r\n## License\r\n\r\nThis project is licensed under the MIT License - see the LICENSE file for details.\r\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "A versatile web scraping library with multiple techniques",
    "version": "0.2.2",
    "project_urls": {
        "Homepage": "https://github.com/ParisNeo/ScrapeMaster"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9b936fe0d3480f49bad3f9bd3114ef1a51bd189c5fc15256ddffd4f0f5fd8f11",
                "md5": "9916f3ee736307eff93e00d0749ae867",
                "sha256": "7ea3fa9deb3dc87781c421aa9345446f7b5eb06f56760d5581a44c8b1249adb7"
            },
            "downloads": -1,
            "filename": "scrapemaster-0.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9916f3ee736307eff93e00d0749ae867",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 10591,
            "upload_time": "2024-10-07T21:22:46",
            "upload_time_iso_8601": "2024-10-07T21:22:46.128493Z",
            "url": "https://files.pythonhosted.org/packages/9b/93/6fe0d3480f49bad3f9bd3114ef1a51bd189c5fc15256ddffd4f0f5fd8f11/scrapemaster-0.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "781a112713953492665a15cfe63d6803bf23f3d0d7a3cf6c342569e02e9b834f",
                "md5": "a5c7caba57d1c6ebf27c27c44da91887",
                "sha256": "038e9f5493c29898d27b5ae3cb770f9760cde9a50e54d95560802c41a58a52b0"
            },
            "downloads": -1,
            "filename": "scrapemaster-0.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "a5c7caba57d1c6ebf27c27c44da91887",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 11213,
            "upload_time": "2024-10-07T21:22:48",
            "upload_time_iso_8601": "2024-10-07T21:22:48.111982Z",
            "url": "https://files.pythonhosted.org/packages/78/1a/112713953492665a15cfe63d6803bf23f3d0d7a3cf6c342569e02e9b834f/scrapemaster-0.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-07 21:22:48",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ParisNeo",
    "github_project": "ScrapeMaster",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "scrapemaster"
}
        