Name | rocketscraper |
Version | 0.0.4 |
home_page | None |
Summary | Python SDK for the Rocket Scraper API. |
upload_time | 2024-11-02 18:01:05 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.7 |
license | MIT |
keywords | rocketscraper, api, web-scraping, ai |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Rocket Scraper API Python SDK
[![](https://img.shields.io/pypi/v/rocketscraper)](https://pypi.org/project/rocketscraper)
Python SDK for the [Rocket Scraper API](https://rocketscraper.com). For more information, visit the [GitHub repository](https://github.com/rocketscraper/rocketscraper-sdk-python).
## Requirements
- [Python](https://www.python.org/) version 3.7 or above
## Installation
```bash
pip install rocketscraper
```
## Usage
To use the SDK, create an instance of the `RocketClient` class and pass your API key as an argument.
### Setup
```python
from rocketscraper import RocketClient
rocket_client = RocketClient('YOUR_API_KEY') # Simplified constructor
```
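Hard-coding the key is fine for a quick test, but a common alternative is to read it from the environment. The sketch below assumes a hypothetical variable name, `ROCKETSCRAPER_API_KEY`, and a hypothetical helper, `load_api_key`; neither is part of the SDK, which is not known to read any environment variable on its own.

```python
import os


def load_api_key(env=None, var_name="ROCKETSCRAPER_API_KEY"):
    """Return the API key stored in an environment variable.

    ROCKETSCRAPER_API_KEY is a hypothetical name chosen for this sketch;
    the SDK does not define it.
    """
    # Allow a plain dict to be passed in, which keeps the helper easy to test.
    env = os.environ if env is None else env
    key = (env.get(var_name) or "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} before creating the client")
    return key
```

The returned value can then be passed to the constructor, for example `RocketClient(load_api_key())`.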
### Scrape
The `scrape` method extracts data from a web page according to a schema and returns the result in the shape that schema describes.
```python
from rocketscraper import RocketClient

try:
    client = RocketClient('YOUR_API_KEY')

    # Define a comprehensive product schema
    schema = {
        "productDetails": {
            "name": "string",
            "brand": "string",
            "currentPrice": "number",
            "originalPrice": "number",
            "discount": "number",
            "availability": "boolean",
            "rating": "number",
            "reviewCount": "integer"
        },
        "specifications": [{
            "name": "string",
            "value": "string"
        }],
        "shipping": {
            "freeShipping": "boolean",
            "estimatedDays": "integer"
        }
    }

    # Add a detailed task description for better accuracy (optional)
    task_description = """
    Extract product information with the following guidelines:
    1. For prices, use the main displayed price (ignore bulk discounts)
    2. Calculate discount percentage from original and current price
    3. Include all technical specifications found on the page
    4. Extract shipping details from both product and shipping sections
    """

    result = client.scrape(
        url='https://marketplace.example.com/products/wireless-earbuds',
        schema=schema,
        task_description=task_description
    )
    print(result)

except Exception as e:
    print(f"Error: {e}")
```
#### Example Output
```json
{
  "productDetails": {
    "name": "Premium Wireless Earbuds Pro X",
    "brand": "AudioTech",
    "currentPrice": 149.99,
    "originalPrice": 199.99,
    "discount": 25.0,
    "availability": true,
    "rating": 4.5,
    "reviewCount": 328
  },
  "specifications": [
    {
      "name": "Battery Life",
      "value": "Up to 8 hours (single charge)"
    },
    {
      "name": "Connectivity",
      "value": "Bluetooth 5.2"
    },
    {
      "name": "Water Resistance",
      "value": "IPX4"
    }
  ],
  "shipping": {
    "freeShipping": true,
    "estimatedDays": 3
  }
}
```
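Because the output is expected to mirror the schema, it can be useful to check a result locally before using it. The following is a minimal sketch of such a check. The helper `conforms` is hypothetical and not part of the SDK; it handles only the four type names that appear in the example schema and treats missing or null fields as failures, which may be stricter than what the API actually guarantees.

```python
def conforms(value, schema):
    """Return True if `value` has the shape described by `schema`.

    Assumption: only "string", "number", "integer" and "boolean" are
    recognised, because those are the names used in the example schema.
    """
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return False
        # Every key named in the schema must be present and well-shaped.
        return all(
            key in value and conforms(value[key], sub)
            for key, sub in schema.items()
        )
    if isinstance(schema, list):
        if not isinstance(value, list):
            return False
        if not schema:
            return True  # no item shape given, accept any list
        # A one-element list describes the shape of every item.
        return all(conforms(item, schema[0]) for item in value)
    # bool is a subclass of int in Python, so handle it before the numbers.
    if isinstance(value, bool):
        return schema == "boolean"
    if schema == "string":
        return isinstance(value, str)
    if schema == "integer":
        return isinstance(value, int)
    if schema == "number":
        return isinstance(value, (int, float))
    return False  # unknown type name or mismatched value
```

Calling `conforms(result, schema)` with the schema and output shown above would return `True`, while a result whose `reviewCount` arrived as the string `"328"` would be rejected.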
### Error Handling
The SDK raises exceptions for various error cases. Wrap your API calls in `try`/`except` blocks to handle them gracefully.
Common error scenarios:
- Invalid API key
- Invalid URL
- Invalid schema format
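One way to handle these cases is a small wrapper that rejects obviously malformed URLs locally and turns any exception from the SDK into a returned error message. This is a sketch under assumptions: `is_http_url` and `safe_scrape` are hypothetical helpers, and because this README does not name the SDK's exception classes, the wrapper catches the generic `Exception`. The `client` argument is any object exposing the `scrape(url=..., schema=..., task_description=...)` call shown earlier.

```python
from urllib.parse import urlparse


def is_http_url(url):
    """Cheap local check: http(s) scheme and a non-empty host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def safe_scrape(client, url, schema, task_description=None):
    """Call `client.scrape` and return `(result, error)`.

    `error` is None on success; on failure `result` is None and `error`
    is a short message.
    """
    if not is_http_url(url):
        return None, f"Invalid URL: {url!r}"
    kwargs = {"url": url, "schema": schema}
    # Only forward the optional argument when the caller supplied one.
    if task_description is not None:
        kwargs["task_description"] = task_description
    try:
        return client.scrape(**kwargs), None
    except Exception as exc:  # no specific exception classes are documented
        return None, f"{type(exc).__name__}: {exc}"
```

Returning a pair keeps the calling code free of nested `try` blocks, at the cost of hiding the original traceback; re-raising inside the `except` branch is the alternative when the traceback matters.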
## Documentation
For more information on how to use the Rocket Scraper API, visit the [Rocket Scraper API documentation](https://docs.rocketscraper.com).
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/rocketscraper/rocketscraper-sdk-python/blob/main/LICENSE) file for more details.
Raw data
{
"_id": null,
"home_page": null,
"name": "rocketscraper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "rocketscraper, api, web-scraping, ai",
"author": null,
"author_email": "Rocket Scraper API <support@rocketscraper.com>",
"download_url": "https://files.pythonhosted.org/packages/08/d3/7896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49/rocketscraper-0.0.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Python SDK for the Rocket Scraper API.",
"version": "0.0.4",
"project_urls": {
"Bug Tracker": "https://github.com/rocketscraper/rocketscraper-sdk-python/issues",
"Homepage": "https://rocketscraper.com",
"Repository": "https://github.com/rocketscraper/rocketscraper-sdk-python"
},
"split_keywords": [
"rocketscraper",
" api",
" web-scraping",
" ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c42046de453a4b3e6c7be1a84f3fd6860fde1e4596b46d34cc486194eb31d77d",
"md5": "8310f77591f774722f752b03345b9a7c",
"sha256": "495372bd5e59923225dba393ab048222a36eca8074bcfdff1bc545d5119217e2"
},
"downloads": -1,
"filename": "rocketscraper-0.0.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8310f77591f774722f752b03345b9a7c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 5336,
"upload_time": "2024-11-02T18:01:04",
"upload_time_iso_8601": "2024-11-02T18:01:04.546112Z",
"url": "https://files.pythonhosted.org/packages/c4/20/46de453a4b3e6c7be1a84f3fd6860fde1e4596b46d34cc486194eb31d77d/rocketscraper-0.0.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "08d37896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49",
"md5": "1e0588bc089fc55a0b6ab9c0a8949841",
"sha256": "2521d19bfd213dc2e0e204e23dbad339e13059c97c35d3f643bfdbcbf15b449a"
},
"downloads": -1,
"filename": "rocketscraper-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "1e0588bc089fc55a0b6ab9c0a8949841",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 4931,
"upload_time": "2024-11-02T18:01:05",
"upload_time_iso_8601": "2024-11-02T18:01:05.853996Z",
"url": "https://files.pythonhosted.org/packages/08/d3/7896f39695031f42411a1293f9c7b22c76a43cc09d67893e3bca27342d49/rocketscraper-0.0.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-02 18:01:05",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rocketscraper",
"github_project": "rocketscraper-sdk-python",
"github_not_found": true,
"lcname": "rocketscraper"
}