<img width="1300" height="200" alt="sdk-banner(1)" src="https://github.com/user-attachments/assets/c4a7857e-10dd-420b-947a-ed2ea5825cb8" />
<h3 align="center">Python SDK by Bright Data, Easy-to-use scalable methods for web search & scraping</h3>
<p></p>
## Installation
To install the package, run the following in your terminal:
```bash
pip install brightdata-sdk
```
> If you are on macOS, create and activate a virtual environment for your project first
## Quick Start
Create a [Bright Data](https://brightdata.com/) account and copy your API key
### Initialize the Client
```python
from brightdata import bdclient
client = bdclient(api_token="your_api_token_here") # can also be defined as BRIGHTDATA_API_TOKEN in your .env file
```
### Launch your first request
Add a SERP search call to your code:
```python
results = client.search("best selling shoes")
print(client.parse_content(results))
```
<img width="4774" height="2149" alt="final-banner" src="https://github.com/user-attachments/assets/1ef4f6ad-b5f2-469f-a260-36d1eeaf8dba" />
## Features
| Feature | Functions | Description
|--------------------------|-----------------------------|-------------------------------------
| **Scrape every website** | `scrape` | Scrape any website using Bright Data's scraping and anti-bot-detection capabilities
| **Web search** | `search` | Search Google and other search engines by query (supports batch searches)
| **Web crawling** | `crawl` | Discover and scrape multiple pages from websites with advanced filtering and depth control
| **AI-powered extraction** | `extract` | Extract specific information from websites using natural language queries and OpenAI
| **Content parsing** | `parse_content` | Extract text, links, images and structured data from API responses (JSON or HTML)
| **Browser automation** | `connect_browser` | Get a WebSocket endpoint for Playwright/Selenium integration with Bright Data's scraping browser
| **Search ChatGPT** | `search_chatGPT` | Prompt ChatGPT and scrape its answers; supports multiple inputs and follow-up prompts
| **Search LinkedIn** | `search_linkedin.posts()`, `search_linkedin.jobs()`, `search_linkedin.profiles()` | Search LinkedIn by specific queries and receive structured data
| **Scrape LinkedIn** | `scrape_linkedin.posts()`, `scrape_linkedin.jobs()`, `scrape_linkedin.profiles()`, `scrape_linkedin.companies()` | Scrape LinkedIn and receive structured data
| **Download functions** | `download_snapshot`, `download_content` | Download content for both sync and async requests
| **Client class** | `bdclient` | Handles authentication, automatic zone creation and management, and options for robust error handling
| **Parallel processing** | **all functions** | All functions use concurrent processing for multiple URLs or queries, and support multiple output formats
### Try using one of the functions
#### `search()`
```python
# Simple single query search
result = client.search("pizza restaurants")
# Try using multiple queries (parallel processing), with custom configuration
queries = ["pizza", "restaurants", "delivery"]
results = client.search(
    queries,
    search_engine="bing",
    country="gb",
    format="raw"
)
```
#### `scrape()`
```python
# Simple single URL scrape
result = client.scrape("https://example.com")
# Multiple URLs (parallel processing) with custom options
urls = ["https://example1.com", "https://example2.com", "https://example3.com"]
results = client.scrape(
    urls,
    format="raw",
    country="gb",
    data_format="screenshot"
)
```
#### `search_chatGPT()`
```python
result = client.search_chatGPT(
    prompt="what day is it today?"
    # prompt=["What are the top 3 programming languages in 2024?", "Best hotels in New York", "Explain quantum computing"],
    # additional_prompt=["Can you explain why?", "Are you sure?", ""]
)
client.download_content(result)  # On a timeout error, your snapshot_id is shown and you can download it later using download_snapshot()
```
#### `search_linkedin.`
Available functions:
client.**`search_linkedin.posts()`**, client.**`search_linkedin.jobs()`**, client.**`search_linkedin.profiles()`**
```python
# Search LinkedIn profiles by name
first_names = ["James", "Idan"]
last_names = ["Smith", "Vilenski"]
result = client.search_linkedin.profiles(first_names, last_names) # can also be changed to async
# result contains the snapshot_id, which can be downloaded using the download_snapshot() function
```
#### `scrape_linkedin.`
Available functions:
client.**`scrape_linkedin.posts()`**, client.**`scrape_linkedin.jobs()`**, client.**`scrape_linkedin.profiles()`**, client.**`scrape_linkedin.companies()`**
```python
post_urls = [
    "https://www.linkedin.com/posts/orlenchner_scrapecon-activity-7180537307521769472-oSYN?trk=public_profile",
    "https://www.linkedin.com/pulse/getting-value-out-sunburst-guillaume-de-b%C3%A9naz%C3%A9?trk=public_profile_article_view"
]
results = client.scrape_linkedin.posts(post_urls)  # can also be changed to async
print(results)  # will print the snapshot_id, which can be downloaded using the download_snapshot() function
```
#### `crawl()`
```python
# Single URL crawl with filters
result = client.crawl(
    url="https://example.com/",
    depth=2,
    filter="/product/",        # Only crawl URLs containing "/product/"
    exclude_filter="/ads/",    # Exclude URLs containing "/ads/"
    custom_output_fields=["markdown", "url", "page_title"]
)
print(f"Crawl initiated. Snapshot ID: {result['snapshot_id']}")
# Download crawl results
data = client.download_snapshot(result['snapshot_id'])
```
#### `parse_content()`
```python
# Parse scraping results
scraped_data = client.scrape("https://example.com")
parsed = client.parse_content(
    scraped_data,
    extract_text=True,
    extract_links=True,
    extract_images=True
)
print(f"Title: {parsed['title']}")
print(f"Text length: {len(parsed['text'])}")
print(f"Found {len(parsed['links'])} links")
```
#### `extract()`
```python
# Basic extraction (URL in query)
result = client.extract("Extract news headlines from CNN.com")
print(result)
# Using URL parameter with structured output
schema = {
    "type": "object",
    "properties": {
        "headlines": {
            "type": "array",
            "items": {"type": "string"}
        }
    },
    "required": ["headlines"]
}
result = client.extract(
    query="Extract main headlines",
    url="https://cnn.com",
    output_scheme=schema
)
print(result) # Returns structured JSON matching the schema
```
#### `connect_browser()`
```python
# For Playwright (default browser_type)
from playwright.sync_api import sync_playwright
client = bdclient(
    api_token="your_api_token",
    browser_username="username-zone-browser_zone1",
    browser_password="your_password"
)

with sync_playwright() as playwright:
    browser = playwright.chromium.connect_over_cdp(client.connect_browser())
    page = browser.new_page()
    page.goto("https://example.com")
    print(f"Title: {page.title()}")
    browser.close()
```
**`download_content`** (for sync requests)
```python
data = client.scrape("https://example.com")
client.download_content(data)
```
**`download_snapshot`** (for async requests)
```python
# Save this call in a separate file
client.download_snapshot("") # Insert your snapshot_id
```
> [!TIP]
> Hover over `search` or any other function in the package to see all of its available parameters.

## Function Parameters
<details>
<summary>🔍 <strong>search(...)</strong></summary>
Searches using the SERP API. Accepts the same arguments as scrape(), plus:
```python
- `query`: Search query string or list of queries
- `search_engine`: "google", "bing", or "yandex"
- Other parameters same as scrape()
```
</details>
<details>
<summary>🔗 <strong>scrape(...)</strong></summary>
Scrapes a single URL or list of URLs using the Web Unlocker.
```python
- `url`: Single URL string or list of URLs
- `zone`: Zone identifier (auto-configured if None)
- `format`: "json" or "raw"
- `method`: HTTP method
- `country`: Two-letter country code
- `data_format`: "markdown", "screenshot", etc.
- `async_request`: Enable async processing
- `max_workers`: Max parallel workers (default: 10)
- `timeout`: Request timeout in seconds (default: 30)
```
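For instance, a short sketch combining several of the parameters listed above (all parameter names are taken from that list; the URLs are placeholders):

```python
results = client.scrape(
    ["https://example.com", "https://example.org"],
    format="json",
    country="us",
    data_format="markdown",
    max_workers=5,   # cap parallel workers below the default of 10
    timeout=60       # allow slower pages more time than the default 30s
)
```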
</details>
<details>
<summary>🕷️ <strong>crawl(...)</strong></summary>
Discover and scrape multiple pages from websites with advanced filtering.
```python
- `url`: Single URL string or list of URLs to crawl (required)
- `ignore_sitemap`: Ignore sitemap when crawling (optional)
- `depth`: Maximum crawl depth relative to entered URL (optional)
- `filter`: Regex to include only certain URLs (e.g. "/product/")
- `exclude_filter`: Regex to exclude certain URLs (e.g. "/ads/")
- `custom_output_fields`: List of output fields to include (optional)
- `include_errors`: Include errors in response (default: True)
```
</details>
<details>
<summary>🔍 <strong>parse_content(...)</strong></summary>
Extract and parse useful information from API responses.
```python
- `data`: Response data from scrape(), search(), or crawl() methods
- `extract_text`: Extract clean text content (default: True)
- `extract_links`: Extract all links from content (default: False)
- `extract_images`: Extract image URLs from content (default: False)
```
</details>
<details>
<summary>🤖 <strong>extract(...)</strong></summary>
Extract specific information from websites using AI-powered natural language processing with OpenAI.
```python
- `query`: Natural language query describing what to extract (required)
- `url`: Single URL or list of URLs to extract from (optional - if not provided, extracts URL from query)
- `output_scheme`: JSON Schema for OpenAI Structured Outputs (optional - enables reliable JSON responses)
- `llm_key`: OpenAI API key (optional - uses OPENAI_API_KEY env variable if not provided)
# Returns: ExtractResult object (string-like with metadata attributes)
# Available attributes: .url, .query, .source_title, .token_usage, .content_length
```
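A minimal sketch reading those metadata attributes (attribute names taken from the list above):

```python
result = client.extract("Extract news headlines from CNN.com")

# ExtractResult behaves like a string but also carries metadata
print(result)                 # extracted content
print(result.url)             # source URL that was used
print(result.query)           # the query that was asked
print(result.source_title)    # title of the source page
print(result.token_usage)     # OpenAI token usage for the call
print(result.content_length)  # length of the fetched content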
</details>
<details>
<summary>🌐 <strong>connect_browser(...)</strong></summary>
Get WebSocket endpoint for browser automation with Bright Data's scraping browser.
```python
# Required client parameters:
- `browser_username`: Username for browser API (format: "username-zone-{zone_name}")
- `browser_password`: Password for browser API authentication
- `browser_type`: "playwright", "puppeteer", or "selenium" (default: "playwright")
# Returns: WebSocket endpoint URL string
```
</details>
<details>
<summary>💾 <strong>download_content(...)</strong></summary>
Save content to local file.
```python
- `content`: Content to save
- `filename`: Output filename (auto-generated if None)
- `format`: File format ("json", "csv", "txt", etc.)
```
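A short usage sketch based on the parameters above (the filename here is just an example; it is auto-generated when omitted):

```python
data = client.scrape("https://example.com")

# Save the scraped content as JSON under an explicit filename
client.download_content(data, filename="example_page.json", format="json")
```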
</details>
<details>
<summary>⚙️ <strong>Configuration Constants</strong></summary>
<p></p>
| Constant | Default | Description |
| ---------------------- | ------- | ------------------------------- |
| `DEFAULT_MAX_WORKERS` | `10` | Max parallel tasks |
| `DEFAULT_TIMEOUT` | `30` | Request timeout (in seconds) |
| `CONNECTION_POOL_SIZE` | `20` | Max concurrent HTTP connections |
| `MAX_RETRIES` | `3` | Retry attempts on failure |
| `RETRY_BACKOFF_FACTOR` | `1.5` | Exponential backoff multiplier |
</details>
## Advanced Configuration
<details>
<summary>🔧 <strong>Environment Variables</strong></summary>
Create a `.env` file in your project root:
```env
BRIGHTDATA_API_TOKEN=your_bright_data_api_token
WEB_UNLOCKER_ZONE=your_web_unlocker_zone # Optional
SERP_ZONE=your_serp_zone # Optional
BROWSER_ZONE=your_browser_zone # Optional
BRIGHTDATA_BROWSER_USERNAME=username-zone-name # For browser automation
BRIGHTDATA_BROWSER_PASSWORD=your_browser_password # For browser automation
OPENAI_API_KEY=your_openai_api_key # For extract() function
```
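A minimal sketch of relying on those variables, assuming the `.env` file has been loaded into the process environment (here via `python-dotenv`, which is an extra dependency) so that `bdclient()` can resolve `BRIGHTDATA_API_TOKEN` without an explicit `api_token` argument:

```python
from dotenv import load_dotenv  # assumption: python-dotenv is installed
from brightdata import bdclient

load_dotenv()        # read .env from the project root into the environment
client = bdclient()  # token resolved from BRIGHTDATA_API_TOKEN

results = client.search("best selling shoes")
print(client.parse_content(results))
```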
</details>
<details>
<summary>🌐 <strong>Manage Zones</strong></summary>
List all active zones
```python
# List all active zones
zones = client.list_zones()
print(f"Found {len(zones)} zones")
```
Configure a custom zone name
```python
client = bdclient(
    api_token="your_token",
    auto_create_zones=False,  # otherwise zones are created automatically
    web_unlocker_zone="custom_zone",
    serp_zone="custom_serp_zone"
)
```
</details>
<details>
<summary>👥 <strong>Client Management</strong></summary>
bdclient Class - Complete parameter list
```python
bdclient(
    api_token: str = None,            # Your Bright Data API token (required)
    auto_create_zones: bool = True,   # Auto-create zones if they don't exist
    web_unlocker_zone: str = None,    # Custom web unlocker zone name
    serp_zone: str = None,            # Custom SERP zone name
    browser_zone: str = None,         # Custom browser zone name
    browser_username: str = None,     # Browser API username (format: "username-zone-{zone_name}")
    browser_password: str = None,     # Browser API password
    browser_type: str = "playwright", # Browser automation tool: "playwright", "puppeteer", "selenium"
    log_level: str = "INFO",          # Logging level: "DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"
    structured_logging: bool = True,  # Use structured JSON logging
    verbose: bool = None              # Enable verbose logging (overrides log_level if True)
)
```
</details>
<details>
<summary>⚠️ <strong>Error Handling</strong></summary>
The SDK includes built-in input validation and retry logic.
In case of zone-related problems, use the **list_zones()** function to check your active zones, and review your [**account settings**](https://brightdata.com/cp/setting/users) to verify that your API key has **"admin permissions"**.
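A rough sketch of that fallback, using only the documented `scrape()` and `list_zones()` calls and catching a generic `Exception` (the SDK's specific exception classes are not listed here):

```python
from brightdata import bdclient

client = bdclient(api_token="your_api_token_here")

try:
    result = client.scrape("https://example.com")
except Exception as err:
    # On zone-related failures, inspect which zones actually exist on the account
    print(f"Request failed: {err}")
    print(f"Active zones: {client.list_zones()}")
```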
</details>
## Support
For any issues, contact [Bright Data support](https://brightdata.com/contact), or open an issue in this repository.