rocketscraper

Name: rocketscraper
Version: 0.0.4
Summary: Python SDK for the Rocket Scraper API.
Upload time: 2024-11-02 18:01:05
Requires Python: >=3.7
License: MIT
Keywords: rocketscraper, api, web-scraping, ai
Requirements: none recorded

# Rocket Scraper API Python SDK

[![](https://img.shields.io/pypi/v/rocketscraper)](https://pypi.org/project/rocketscraper)

Python SDK for the [Rocket Scraper API](https://rocketscraper.com). For more information, visit the [GitHub repository](https://github.com/rocketscraper/rocketscraper-sdk-python).

## Requirements

- [Python](https://www.python.org/) version 3.7 or above

## Installation

```bash
pip install rocketscraper
```

## Usage

To use the SDK, create an instance of the `RocketClient` class and pass your API key as its argument.

### Setup

```python
from rocketscraper import RocketClient

rocket_client = RocketClient('YOUR_API_KEY')  # Replace with your own API key
```
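
Hard-coding the key is fine for a quick test, but a common alternative is to read it from an environment variable. The sketch below assumes a variable named `ROCKETSCRAPER_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not part of the SDK.

```python
import os


def load_api_key(env_var="ROCKETSCRAPER_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is unset or empty.

    Both this helper and the default variable name are hypothetical.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return api_key


# Usage (requires `pip install rocketscraper`):
# from rocketscraper import RocketClient
# rocket_client = RocketClient(load_api_key())
```

Failing early with a clear message keeps a missing key from surfacing later as an authentication error.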

### Scrape

The `scrape` method extracts data from a web page according to a schema that you define, and returns the data in the structure that the schema describes.

```python
from rocketscraper import RocketClient

try:
    client = RocketClient('YOUR_API_KEY')
    
    # Define a comprehensive product schema
    schema = {
        "productDetails": {
            "name": "string",
            "brand": "string",
            "currentPrice": "number",
            "originalPrice": "number",
            "discount": "number",
            "availability": "boolean",
            "rating": "number",
            "reviewCount": "integer"
        },
        "specifications": [{
            "name": "string",
            "value": "string"
        }],
        "shipping": {
            "freeShipping": "boolean",
            "estimatedDays": "integer"
        }
    }
    
    # Add a detailed task description for better accuracy (optional)
    task_description = """
    Extract product information with the following guidelines:
    1. For prices, use the main displayed price (ignore bulk discounts)
    2. Calculate discount percentage from original and current price
    3. Include all technical specifications found on the page
    4. Extract shipping details from both product and shipping sections
    """
    
    result = client.scrape(
        url='https://marketplace.example.com/products/wireless-earbuds',
        schema=schema,
        task_description=task_description
    )
    print(result)

except Exception as e:
    print(f"Error: {e}")
```

#### Example Output

```json
{
    "productDetails": {
        "name": "Premium Wireless Earbuds Pro X",
        "brand": "AudioTech",
        "currentPrice": 149.99,
        "originalPrice": 199.99,
        "discount": 25.0,
        "availability": true,
        "rating": 4.5,
        "reviewCount": 328
    },
    "specifications": [
        {
            "name": "Battery Life",
            "value": "Up to 8 hours (single charge)"
        },
        {
            "name": "Connectivity",
            "value": "Bluetooth 5.2"
        },
        {
            "name": "Water Resistance",
            "value": "IPX4"
        }
    ],
    "shipping": {
        "freeShipping": true,
        "estimatedDays": 3
    }
}
```
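
Because the output mirrors the schema, it can be useful to check a result before using it. The sketch below is a hypothetical helper, not part of the SDK. It assumes `scrape` returns plain `dict` and `list` values shaped like the example, that each list in the schema holds exactly one item schema, and that the type names map to Python types as shown; that mapping is inferred from the example schema rather than taken from the API documentation.

```python
# Assumed mapping from schema type names to Python type checks.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def find_mismatches(value, schema, path="result"):
    """Return a list of readable mismatches between `value` and `schema`."""
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected an object"]
        problems = []
        for key, sub_schema in schema.items():
            if key not in value:
                problems.append(f"{path}.{key}: missing")
            else:
                problems.extend(find_mismatches(value[key], sub_schema, f"{path}.{key}"))
        return problems
    if isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected a list"]
        problems = []
        for index, item in enumerate(value):
            # Every list element is checked against the single item schema.
            problems.extend(find_mismatches(item, schema[0], f"{path}[{index}]"))
        return problems
    check = TYPE_CHECKS.get(schema)
    if check is None:
        return [f"{path}: unknown schema type {schema!r}"]
    return [] if check(value) else [f"{path}: expected {schema}"]


# Usage with a small schema and a result whose integer field arrived as a string.
schema = {"shipping": {"freeShipping": "boolean", "estimatedDays": "integer"}}
result = {"shipping": {"freeShipping": True, "estimatedDays": "3"}}
print(find_mismatches(result, schema))
# prints ['result.shipping.estimatedDays: expected integer']
```

An empty list means the result matched the schema under these assumptions. Extra keys in the result are ignored.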

### Error Handling

The SDK raises exceptions for various error cases. Wrap your API calls in `try`/`except` blocks to handle potential errors gracefully.

Common error scenarios:
- Invalid API key
- Invalid URL
- Invalid schema format
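
One way to handle the scenarios above is a small wrapper that logs the failure and returns `None` instead of raising. This is a minimal sketch: `scrape_or_none` is a hypothetical helper, and it catches the generic `Exception` because the SDK's specific exception classes are not listed here.

```python
import logging

logger = logging.getLogger(__name__)


def scrape_or_none(client, url, schema, task_description=None):
    """Call `client.scrape` and return its result, or None if it raises.

    `client` is any object with a `scrape` method that accepts the keyword
    arguments used in the Scrape example, such as a `RocketClient`.
    """
    kwargs = {"url": url, "schema": schema}
    # Only forward the optional description when the caller supplied one.
    if task_description is not None:
        kwargs["task_description"] = task_description
    try:
        return client.scrape(**kwargs)
    except Exception as exc:  # specific SDK exception types are an unknown here
        logger.error("Scrape of %s failed: %s", url, exc)
        return None
```

Callers can then test the return value for `None` rather than repeating the `try`/`except` block around every call.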

## Documentation

For more information on how to use the Rocket Scraper API, visit the [Rocket Scraper API documentation](https://docs.rocketscraper.com).

## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/rocketscraper/rocketscraper-sdk-python/blob/main/LICENSE) file for more details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "rocketscraper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "rocketscraper, api, web-scraping, ai",
    "author": null,
    "author_email": "Rocket Scraper API <support@rocketscraper.com>",
    "download_url": "https://files.pythonhosted.org/packages/08/d3/7896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49/rocketscraper-0.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python SDK for the Rocket Scraper API.",
    "version": "0.0.4",
    "project_urls": {
        "Bug Tracker": "https://github.com/rocketscraper/rocketscraper-sdk-python/issues",
        "Homepage": "https://rocketscraper.com",
        "Repository": "https://github.com/rocketscraper/rocketscraper-sdk-python"
    },
    "split_keywords": [
        "rocketscraper",
        " api",
        " web-scraping",
        " ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c42046de453a4b3e6c7be1a84f3fd6860fde1e4596b46d34cc486194eb31d77d",
                "md5": "8310f77591f774722f752b03345b9a7c",
                "sha256": "495372bd5e59923225dba393ab048222a36eca8074bcfdff1bc545d5119217e2"
            },
            "downloads": -1,
            "filename": "rocketscraper-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8310f77591f774722f752b03345b9a7c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 5336,
            "upload_time": "2024-11-02T18:01:04",
            "upload_time_iso_8601": "2024-11-02T18:01:04.546112Z",
            "url": "https://files.pythonhosted.org/packages/c4/20/46de453a4b3e6c7be1a84f3fd6860fde1e4596b46d34cc486194eb31d77d/rocketscraper-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "08d37896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49",
                "md5": "1e0588bc089fc55a0b6ab9c0a8949841",
                "sha256": "2521d19bfd213dc2e0e204e23dbad339e13059c97c35d3f643bfdbcbf15b449a"
            },
            "downloads": -1,
            "filename": "rocketscraper-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "1e0588bc089fc55a0b6ab9c0a8949841",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 4931,
            "upload_time": "2024-11-02T18:01:05",
            "upload_time_iso_8601": "2024-11-02T18:01:05.853996Z",
            "url": "https://files.pythonhosted.org/packages/08/d3/7896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49/rocketscraper-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-02 18:01:05",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rocketscraper",
    "github_project": "rocketscraper-sdk-python",
    "github_not_found": true,
    "lcname": "rocketscraper"
}
        