# ScrapingBee Python SDK
[ScrapingBee](https://www.scrapingbee.com/) is a web scraping API that handles headless browsers and rotates proxies for you. The Python SDK makes it easier to interact with ScrapingBee's API.
## Installation
You can install the ScrapingBee Python SDK with pip.
```bash
pip install scrapingbee
```
## Usage
The ScrapingBee Python SDK is a wrapper around the [requests](https://docs.python-requests.org/en/master/) library. ScrapingBee supports GET and POST requests.
Sign up for ScrapingBee to [get your API key](https://app.scrapingbee.com/account/register) and some free credits to get started.
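To keep the key out of source code, one option is to read it from an environment variable. The sketch below is only an illustration: the variable name `SCRAPINGBEE_API_KEY` and the helper `load_api_key` are assumptions made for this example, not something the SDK defines or reads on its own.

```python
import os


def load_api_key(env_var="SCRAPINGBEE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this example; the SDK itself
    does not look it up.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return api_key
```

The result can then be passed to the client, for example `ScrapingBeeClient(api_key=load_api_key())`.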
### Making a GET request
```python
>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        # Block ads on the page you want to scrape
        'block_ads': False,
        # Block images and CSS on the page you want to scrape
        'block_resources': True,
        # Premium proxy geolocation
        'country_code': '',
        # Control the device the request will be sent from
        'device': 'desktop',
        # Use some data extraction rules
        'extract_rules': {'title': 'h1'},
        # Wrap response in JSON
        'json_response': False,
        # Interact with the webpage you want to scrape
        'js_scenario': {
            "instructions": [
                {"wait_for": "#slow_button"},
                {"click": "#slow_button"},
                {"scroll_x": 1000},
                {"wait": 1000},
                {"scroll_x": 1000},
                {"wait": 1000},
            ]
        },
        # Use premium proxies to bypass difficult-to-scrape websites (10-25 credits/request)
        'premium_proxy': False,
        # Execute JavaScript code with a headless browser (5 credits/request)
        'render_js': True,
        # Return the original HTML before the JavaScript rendering
        'return_page_source': False,
        # Return a screenshot of the page as a PNG image
        'screenshot': False,
        # Take a full-page screenshot without the window limitation
        'screenshot_full_page': False,
        # Transparently return the same HTTP code as the requested page
        'transparent_status_code': False,
        # Wait, in milliseconds, before returning the response
        'wait': 0,
        # Wait for a CSS selector before returning the response, e.g. ".title"
        'wait_for': '',
        # Set the browser window width in pixels
        'window_width': 1920,
        # Set the browser window height in pixels
        'window_height': 1080
    },
    headers={
        # Forward custom headers to the target website
        "key": "value"
    },
    cookies={
        # Forward custom cookies to the target website
        "name": "value"
    }
)
>>> response.text
'<!DOCTYPE html><html lang="en"><head>...'
```
ScrapingBee takes various parameters to render JavaScript, execute a custom JavaScript script, use a premium proxy from a specific geolocation and more.
You can find all the supported parameters on [ScrapingBee's documentation](https://www.scrapingbee.com/documentation/).
You can send custom cookies and headers like you would normally do with the requests library.
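When `extract_rules` is set, the response body is expected to hold the extracted fields rather than raw HTML. The sketch below assumes that the body is a JSON object keyed by rule name (check the documentation for the exact shape); `parse_extracted` and the sample body are made up for illustration.

```python
import json


def parse_extracted(body):
    """Decode a body that is expected to hold extracted fields as a JSON object.

    Returns a dict, or None when the body is not a JSON object
    (for example, plain HTML).
    """
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


# Made-up sample body, standing in for response.text
sample_body = '{"title": "Example title"}'
fields = parse_extracted(sample_body)
print(fields["title"])
```

Returning `None` instead of raising keeps the caller's check simple when a page comes back as HTML.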
## Screenshot
Here is a short example of how to retrieve and store a screenshot of the ScrapingBee blog at a mobile resolution.
```python
>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        # Take a screenshot
        'screenshot': True,
        # Specify that we need the full height
        'screenshot_full_page': True,
        # Specify a mobile width in pixels
        'window_width': 375
    }
)

>>> if response.ok:
        with open("./scrapingbee_mobile.png", "wb") as f:
            f.write(response.content)
```
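Since the screenshot is returned as a PNG image, it can be worth confirming that the bytes really are a PNG before writing them to disk, in case the body is something else. The helper below is a hypothetical sketch, not part of the SDK; it only checks the standard 8-byte PNG signature.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_png(content, path):
    """Write content to path only if it starts with the PNG signature.

    Returns True when the file was written, False otherwise.
    """
    if not content.startswith(PNG_SIGNATURE):
        return False
    with open(path, "wb") as f:
        f.write(content)
    return True
```

With the example above, this would be called as `save_png(response.content, "./scrapingbee_mobile.png")`.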
## Using ScrapingBee with Scrapy
Scrapy is one of the most popular Python web scraping frameworks. You can [integrate ScrapingBee's API with the Scrapy middleware](https://github.com/ScrapingBee/scrapy-scrapingbee).
## Retries
The client includes a retry mechanism for 5XX responses.
```python
>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        'render_js': True,
    },
    retries=5
)
```
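To illustrate the general idea of retrying on 5XX responses, here is a minimal sketch of such a loop. It is not the SDK's implementation: `get_with_retries`, its `delay` parameter, and the reading of `retries` as a maximum number of attempts are assumptions made for this example.

```python
import time


def get_with_retries(fetch, retries=3, delay=0.0):
    """Call fetch() until it returns a non-5XX response or attempts run out.

    fetch is any zero-argument callable returning an object with a
    status_code attribute; retries is treated here as the maximum
    number of attempts (an assumption of this sketch).
    """
    response = None
    for attempt in range(max(1, retries)):
        response = fetch()
        if response.status_code < 500:
            break  # success or client error: retrying would not help
        if attempt < retries - 1:
            time.sleep(delay)  # pause before the next attempt
    return response
```

The last response is returned even when every attempt failed, so the caller can still inspect its status code.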
Raw data
{
"_id": null,
"home_page": "https://github.com/scrapingbee/scrapingbee-python",
"name": "scrapingbee",
"maintainer": "Pierre de Wulf",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "hello@scrapingbee.com",
"keywords": "",
"author": "Ari Bajo Rouvinen",
"author_email": "arimbr@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/f8/4f/55ca54247c0c124b898ce428027f5cddd39c1377cb27e263edbbda85d348/scrapingbee-2.0.1.tar.gz",
"platform": null,
"description": "# ScrapingBee Python SDK\n\n[](https://github.com/scrapingbee/scrapingbee-python/actions)\n[](https://pypi.org/project/scrapingbee/)\n[](https://pypi.org/project/scrapingbee/)\n\n[ScrapingBee](https://www.scrapingbee.com/) is a web scraping API that handles headless browsers and rotates proxies for you. The Python SDK makes it easier to interact with ScrapingBee's API.\n\n## Installation\n\nYou can install ScrapingBee Python SDK with pip.\n\n```bash\npip install scrapingbee\n```\n\n## Usage\n\nThe ScrapingBee Python SDK is a wrapper around the [requests](https://docs.python-requests.org/en/master/) library. ScrapingBee supports GET and POST requests.\n\nSignup to ScrapingBee to [get your API key](https://app.scrapingbee.com/account/register) and some free credits to get started.\n\n### Making a GET request\n\n```python\n>>> from scrapingbee import ScrapingBeeClient\n\n>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')\n\n>>> response = client.get(\n 'https://www.scrapingbee.com/blog/', \n params={\n # Block ads on the page you want to scrape\t\n 'block_ads': False,\n # Block images and CSS on the page you want to scrape\t\n 'block_resources': True,\n # Premium proxy geolocation\n 'country_code': '',\n # Control the device the request will be sent from\t\n 'device': 'desktop',\n # Use some data extraction rules\n 'extract_rules': {'title': 'h1'},\n # Wrap response in JSON\n 'json_response': False,\n # Interact with the webpage you want to scrape \n 'js_scenario': {\n \"instructions\": [\n {\"wait_for\": \"#slow_button\"},\n {\"click\": \"#slow_button\"},\n {\"scroll_x\": 1000},\n {\"wait\": 1000},\n {\"scroll_x\": 1000},\n {\"wait\": 1000}, \n ]\n },\n # Use premium proxies to bypass difficult to scrape websites (10-25 credits/request)\n 'premium_proxy': False,\n # Execute JavaScript code with a Headless Browser (5 credits/request)\n 'render_js': True,\n # Return the original HTML before the JavaScript rendering\t\n 
'return_page_source': False,\n # Return page screenshot as a png image\n 'screenshot': False,\n # Take a full page screenshot without the window limitation\n 'screenshot_full_page': False,\n # Transparently return the same HTTP code of the page requested.\n 'transparent_status_code': False,\n # Wait, in miliseconds, before returning the response\n 'wait': 0,\n # Wait for CSS selector before returning the response, ex \".title\"\n 'wait_for': '',\n # Set the browser window width in pixel\n 'window_width': 1920,\n # Set the browser window height in pixel\n 'window_height': 1080\n },\n headers={\n # Forward custom headers to the target website\n \"key\": \"value\"\n },\n cookies={\n # Forward custom cookies to the target website\n \"name\": \"value\"\n }\n)\n>>> response.text\n'<!DOCTYPE html><html lang=\"en\"><head>...'\n```\n\nScrapingBee takes various parameters to render JavaScript, execute a custom JavaScript script, use a premium proxy from a specific geolocation and more. \n\nYou can find all the supported parameters on [ScrapingBee's documentation](https://www.scrapingbee.com/documentation/).\n\nYou can send custom cookies and headers like you would normally do with the requests library.\n\n## Screenshot\n\nHere a little exemple on how to retrieve and store a screenshot from the ScrapingBee blog in its mobile resolution.\n\n```python\n>>> from scrapingbee import ScrapingBeeClient\n\n>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')\n\n>>> response = client.get(\n 'https://www.scrapingbee.com/blog/', \n params={\n # Take a screenshot\n 'screenshot': True,\n # Specify that we need the full height\n 'screenshot_full_page': True,\n # Specify a mobile width in pixel\n 'window_width': 375\n }\n)\n\n>>> if response.ok:\n with open(\"./scrapingbee_mobile.png\", \"wb\") as f:\n f.write(response.content)\n```\n\n## Using ScrapingBee with Scrapy\n\nScrapy is the most popular Python web scraping framework. 
You can easily [integrate ScrapingBee's API with the Scrapy middleware](https://github.com/ScrapingBee/scrapy-scrapingbee).\n\n\n## Retries\n\nThe client includes a retry mechanism for 5XX responses.\n\n```python\n>>> from scrapingbee import ScrapingBeeClient\n\n>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')\n\n>>> response = client.get(\n 'https://www.scrapingbee.com/blog/', \n params={\n 'render_js': True,\n },\n retries=5\n)\n```\n\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "ScrapingBee Python SDK",
"version": "2.0.1",
"project_urls": {
"Homepage": "https://github.com/scrapingbee/scrapingbee-python"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b632c995c1d670f05e4879b5ba23a6d1690a930a519809111504050b889cf715",
"md5": "3ba294de1eb887d3a5a1a63fed19e5f9",
"sha256": "55c7eca71f5be891718795750b3149a2aead1458b0ae6716fdefeb6cf8e81e82"
},
"downloads": -1,
"filename": "scrapingbee-2.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3ba294de1eb887d3a5a1a63fed19e5f9",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 5201,
"upload_time": "2023-10-17T15:36:15",
"upload_time_iso_8601": "2023-10-17T15:36:15.137616Z",
"url": "https://files.pythonhosted.org/packages/b6/32/c995c1d670f05e4879b5ba23a6d1690a930a519809111504050b889cf715/scrapingbee-2.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f84f55ca54247c0c124b898ce428027f5cddd39c1377cb27e263edbbda85d348",
"md5": "4e55bbc1097e0a0a2bd424e9c4d88120",
"sha256": "b33622d4c6111c0ae454fa23398a43dccf7b7e126cc44ac498d3411a8514069f"
},
"downloads": -1,
"filename": "scrapingbee-2.0.1.tar.gz",
"has_sig": false,
"md5_digest": "4e55bbc1097e0a0a2bd424e9c4d88120",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 4730,
"upload_time": "2023-10-17T15:36:16",
"upload_time_iso_8601": "2023-10-17T15:36:16.884264Z",
"url": "https://files.pythonhosted.org/packages/f8/4f/55ca54247c0c124b898ce428027f5cddd39c1377cb27e263edbbda85d348/scrapingbee-2.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-10-17 15:36:16",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "scrapingbee",
"github_project": "scrapingbee-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"tox": true,
"lcname": "scrapingbee"
}