pydoll-python


- Name: pydoll-python
- Version: 2.8.0
- Summary: Pydoll is a library for automating Chromium-based browsers without a WebDriver, offering realistic interactions.
- Upload time: 2025-08-28 04:11:22
- Author: Thalison Fernandes
- Requires Python: <4.0,>=3.10

<p align="center">
    <img src="https://github.com/user-attachments/assets/219f2dbc-37ed-4aea-a289-ba39cdbb335d" alt="Pydoll Logo" /> <br>
</p>
<h1 align="center">Pydoll: Automate the Web, Naturally</h1>

<p align="center">
    <a href="https://github.com/autoscrape-labs/pydoll/stargazers"><img src="https://img.shields.io/github/stars/autoscrape-labs/pydoll?style=social"></a>
    <a href="https://codecov.io/gh/autoscrape-labs/pydoll" >
        <img src="https://codecov.io/gh/autoscrape-labs/pydoll/graph/badge.svg?token=40I938OGM9"/>
    </a>
    <img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/tests.yml/badge.svg" alt="Tests">
    <img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/ruff-ci.yml/badge.svg" alt="Ruff CI">
    <img src="https://github.com/autoscrape-labs/pydoll/actions/workflows/mypy.yml/badge.svg" alt="MyPy CI">
    <img src="https://img.shields.io/badge/python-%3E%3D3.10-blue" alt="Python >= 3.10">
    <a href="https://deepwiki.com/autoscrape-labs/pydoll"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
</p>


<p align="center">
  📖 <a href="https://pydoll.tech/">Documentation</a> •
  🚀 <a href="#-getting-started">Getting Started</a> •
  ⚡ <a href="#-advanced-features">Advanced Features</a> •
  🤝 <a href="#-contributing">Contributing</a> •
  💖 <a href="#-support-my-work">Support My Work</a>
</p>

Imagine the following scenario: you need to automate tasks in your browser. Maybe it's testing a web application, collecting data from a site, or even automating repetitive processes. Normally this involves using external drivers, complex configurations, and many compatibility issues.

**Pydoll was born to solve these problems.**

Built from scratch with a different philosophy, Pydoll connects directly to the Chrome DevTools Protocol (CDP), eliminating the need for external drivers. This clean implementation along with realistic ways of clicking, navigating and interacting with elements makes it practically indistinguishable from a real user.

We believe that powerful automation shouldn't require you to become an expert in configuration or constantly fight with bot protection systems. With Pydoll, you can focus on what really matters: your automation logic, not the underlying complexity or protection systems.

<div>
  <h4>Be a good human. Give it a star ⭐</h4> 
    No stars, no bugs fixed. Just kidding (maybe)
</div>

## 🌟 What makes Pydoll special?

- **Zero Webdrivers**: Say goodbye to webdriver compatibility issues
- **Human-like Interaction Engine**: Capable of passing behavioral CAPTCHAs like reCAPTCHA v3 or Turnstile, depending on IP reputation and interaction patterns
- **Asynchronous Performance**: For high-speed automation and multiple simultaneous tasks
- **Humanized Interactions**: Mimic real user behavior
- **Simplicity**: With Pydoll, you install and you're ready to automate.

## What's New

### Remote connections via WebSocket — control any Chrome from anywhere!

You asked for it, we delivered. You can now connect to an already running browser remotely via its WebSocket address and use the full Pydoll API immediately.

```python
import asyncio

from pydoll.browser.chromium import Chrome


async def main():
    chrome = Chrome()
    # Attach to a browser that is already running instead of launching one
    tab = await chrome.connect('ws://YOUR_HOST:9222/devtools/browser/XXXX')

    # Full power unlocked: navigation, element automation, requests, events…
    await tab.go_to('https://example.com')
    title = await tab.execute_script('return document.title')
    print(title)

asyncio.run(main())
```

This makes it effortless to run Pydoll against remote/CI browsers, containers, or shared debugging targets — no local launch required. Just point to the WS endpoint and automate.
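The `XXXX` part of the WebSocket address is specific to each browser session. A Chromium-based browser started with `--remote-debugging-port` normally reports its full address in the `webSocketDebuggerUrl` field of the JSON served at `/json/version`. The sketch below uses only the standard library and is not part of Pydoll; the helper names and the sample payload are illustrative assumptions.

```python
import json
from urllib.request import urlopen


def extract_ws_url(version_payload: str) -> str:
    """Pull the browser-level WebSocket URL out of a /json/version response body."""
    info = json.loads(version_payload)
    try:
        return info['webSocketDebuggerUrl']
    except KeyError:
        raise ValueError('response has no webSocketDebuggerUrl field') from None


def fetch_ws_url(host: str, port: int = 9222, timeout: float = 5.0) -> str:
    """Ask a running browser's DevTools HTTP endpoint for its WebSocket URL.

    Hypothetical helper: it needs network access to the debugging port.
    """
    with urlopen(f'http://{host}:{port}/json/version', timeout=timeout) as resp:
        return extract_ws_url(resp.read().decode('utf-8'))


# Illustrative payload; a real browser returns more fields than this.
sample = (
    '{"Browser": "Chrome", '
    '"webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/abc-123"}'
)
ws_url = extract_ws_url(sample)
print(ws_url)
```

The resulting string can then be passed to `chrome.connect(...)` in place of the placeholder address.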

### Navigate the DOM like a pro: get_children_elements() and get_siblings_elements()

Two delightful helpers to traverse complex layouts with intention:

```python
# Grab direct children of a container
container = await tab.find(id='cards')
cards = await container.get_children_elements(max_depth=1)

# Want to go deeper? This will return children of children (and so on)
elements = await container.get_children_elements(max_depth=2) 

# Walk horizontal lists without re-querying the DOM
active = await tab.find(class_name='item-active')
siblings = await active.get_siblings_elements()

print(len(cards), len(siblings))
```

Use them to cut boilerplate, express intent, and keep your scraping/automation logic clean and readable — especially in dynamic grids, lists and menus.
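To make the `max_depth` idea concrete, here is a minimal sketch of depth-limited traversal over a plain dictionary tree. It is a toy model, not Pydoll's implementation: the real methods work on live DOM nodes, and the result ordering shown here (depth-first) is an assumption.

```python
from typing import Any

Node = dict[str, Any]


def children_up_to_depth(node: Node, max_depth: int = 1) -> list[Node]:
    """Collect descendants of `node` down to `max_depth` levels (1 = direct children)."""
    if max_depth < 1:
        return []
    found: list[Node] = []
    for child in node.get('children', []):
        found.append(child)
        # Recurse with one level less, so grandchildren appear only when max_depth >= 2.
        found.extend(children_up_to_depth(child, max_depth - 1))
    return found


def siblings_of(parent: Node, node: Node) -> list[Node]:
    """Return every child of `parent` except `node` itself (compared by identity)."""
    return [child for child in parent.get('children', []) if child is not node]


# A tiny stand-in for a DOM subtree: two cards, each with a title.
title_a = {'tag': 'h3', 'text': 'A', 'children': []}
title_b = {'tag': 'h3', 'text': 'B', 'children': []}
card_a = {'tag': 'div', 'text': 'card A', 'children': [title_a]}
card_b = {'tag': 'div', 'text': 'card B', 'children': [title_b]}
container = {'tag': 'section', 'text': '', 'children': [card_a, card_b]}

print(len(children_up_to_depth(container, max_depth=1)))
print(len(children_up_to_depth(container, max_depth=2)))
print(len(siblings_of(container, card_a)))
```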

### WebElement: state waiting and new public APIs

- New `wait_until(...)` on `WebElement` to await element states with minimal code:

```python
# Wait until it becomes visible OR the timeout expires
await element.wait_until(is_visible=True, timeout=5)

# Wait until it becomes interactable (visible, on top, receiving pointer events)
await element.wait_until(is_interactable=True, timeout=10)
```

- Methods now public on `WebElement`:
  - `is_visible()`
    - Checks that the element has a visible area (> 0), isn’t hidden by CSS and is in the viewport (after `scroll_into_view()` when needed). Useful pre-check before interactions.
  - `is_interactable()`
    - “Click-ready” state: combines visibility, enabledness and pointer-event hit testing. Ideal for robust flows that avoid lost clicks.
  - `is_on_top()`
    - Verifies the element is the top hit-test target at the intended click point, avoiding overlays.
  - `execute_script(script: str, return_by_value: bool = False)`
    - Executes JavaScript in the element’s own context (where `this` is the element). Great for fine-tuning and quick inspections.

```python
# Visually outline the element via JS
await element.execute_script("this.style.outline='2px solid #22d3ee'")

# Confirm states
visible = await element.is_visible()
interactable = await element.is_interactable()
on_top = await element.is_on_top()
```

These additions simplify waiting and state validation before clicking/typing, reducing flakiness and making automations more predictable.
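Conceptually, waiting for a state is a polling loop with a deadline. The sketch below shows that pattern with plain `asyncio`; it is not Pydoll's code, the `poll_until` helper is hypothetical, and raising `TimeoutError` on expiry is an assumption about error handling rather than documented library behavior.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def poll_until(
    check: Callable[[], Awaitable[bool]],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> None:
    """Re-run `check` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if await check():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        await asyncio.sleep(interval)


async def demo() -> int:
    attempts = 0

    async def becomes_ready() -> bool:
        # Stand-in for a state check such as `element.is_visible()`.
        nonlocal attempts
        attempts += 1
        return attempts >= 3

    await poll_until(becomes_ready, timeout=1.0, interval=0.01)
    return attempts


print(asyncio.run(demo()))
```

The same helper could wrap any awaitable check, which is handy when the condition is not an element state (for example, a counter in the page reaching a value).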


## 📦 Installation

```bash
pip install pydoll-python
```

And that's it! Just install and start automating.

## 🚀 Getting Started

### Your first automation

Let's start with a real example: an automation that performs a Google search and clicks the result for the Pydoll repository. With this example, we can see how the library works and how you can start automating your tasks.

```python
import asyncio

from pydoll.browser import Chrome
from pydoll.constants import Key

async def google_search(query: str):
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://www.google.com')
        search_box = await tab.find(tag_name='textarea', name='q')
        await search_box.insert_text(query)
        await search_box.press_keyboard_key(Key.ENTER)
        await (await tab.find(
            tag_name='h3',
            text='autoscrape-labs/pydoll',
            timeout=10,
        )).click()
        await tab.find(id='repository-container-header', timeout=10)

asyncio.run(google_search('pydoll python'))
```

With no configuration and just a short script, we ran a complete Google search!
Now let's see how to extract data from a page, continuing from the previous example.
Assume that in the code below we are already on the Pydoll repository page. We want to extract the following information:

- Project description
- Number of stars
- Number of forks
- Number of issues
- Number of pull requests

Let's get started! To get the project description, we'll use an XPath query. You can check the documentation on how to build your own queries.

```python
description = await (await tab.query(
    '//h2[contains(text(), "About")]/following-sibling::p',
    timeout=10,
)).text
```

And that's it! Let's understand what this query does:

1. `//h2[contains(text(), "About")]` - Selects `<h2>` elements whose text contains "About"
2. `/following-sibling::p` - Selects the `<p>` elements that follow that `<h2>` under the same parent; `tab.query` returns the first match
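The two steps above can be reproduced by hand on a small document to see what the axis selects. The sketch below uses the standard library's `xml.etree.ElementTree`, which does not support `following-sibling` itself, so the logic is written out explicitly; the sample markup and the helper name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

PAGE = """
<div>
  <h2>About</h2>
  <p>Project description</p>
  <p>Extra paragraph</p>
  <h2>Releases</h2>
  <p>Release notes</p>
</div>
"""


def paragraphs_after_heading(root: ET.Element, needle: str) -> list[str]:
    """Emulate //h2[contains(text(), needle)]/following-sibling::p by hand."""
    texts: list[str] = []
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == 'h2' and needle in (child.text or ''):
                # following-sibling::p means every later <p> under the same parent.
                texts.extend(
                    sibling.text or ''
                    for sibling in children[index + 1:]
                    if sibling.tag == 'p'
                )
    return texts


matches = paragraphs_after_heading(ET.fromstring(PAGE), 'About')
print(matches)     # all following <p> siblings, in document order
print(matches[0])  # the one a single-element query would return
```

Note that the axis also reaches the paragraph under the later "Releases" heading, because it is still a sibling of the matched `<h2>`; only taking the first match narrows the result to the description.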

Now let's get the rest of the data:

```python
number_of_stars = await (await tab.find(
    id='repo-stars-counter-star'
)).text

number_of_forks = await (await tab.find(
    id='repo-network-counter'
)).text
number_of_issues = await (await tab.find(
    id='issues-repo-tab-count',
)).text
number_of_pull_requests = await (await tab.find(
    id='pull-requests-repo-tab-count',
)).text

data = {
    'description': description,
    'number_of_stars': number_of_stars,
    'number_of_forks': number_of_forks,
    'number_of_issues': number_of_issues,
    'number_of_pull_requests': number_of_pull_requests,
}
print(data)

```

We managed to extract all the necessary data!

### Custom Configurations

Sometimes we need more control over the browser. Pydoll offers a flexible way to do this. Let's see the example below:


```python
import asyncio

from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions as Options

async def custom_automation():
    # Configure browser options
    options = Options()
    options.add_argument('--proxy-server=username:password@ip:port')
    options.add_argument('--window-size=1920,1080')
    options.binary_location = '/path/to/your/browser'
    options.start_timeout = 20

    async with Chrome(options=options) as browser:
        tab = await browser.start()
        # Your automation code here
        await tab.go_to('https://example.com')
        # The browser is now using your custom settings

asyncio.run(custom_automation())
```

In this example, we're configuring the browser to use a proxy and a 1920x1080 window, in addition to a custom path for the Chrome binary, in case your installation location is different from the common defaults.


## ⚡ Advanced Features

Pydoll offers a series of advanced features to please even the most demanding users.



### Advanced Element Search

There are several ways to find elements on the page. Whichever style you prefer, there is an option that fits:

```python
import asyncio
from pydoll.browser import Chrome

async def element_finding_examples():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com')

        # Find by attributes (most intuitive)
        submit_btn = await tab.find(
            tag_name='button',
            class_name='btn-primary',
            text='Submit'
        )
        # Find by ID
        username_field = await tab.find(id='username')
        # Find multiple elements
        all_links = await tab.find(tag_name='a', find_all=True)
        # CSS selectors and XPath
        nav_menu = await tab.query('nav.main-menu')
        specific_item = await tab.query('//div[@data-testid="item-123"]')
        # With timeout and error handling
        delayed_element = await tab.find(
            class_name='dynamic-content',
            timeout=10,
            raise_exc=False  # Returns None if not found
        )
        # Advanced: Custom attributes
        custom_element = await tab.find(
            data_testid='submit-button',
            aria_label='Submit form'
        )

asyncio.run(element_finding_examples())
```

The `find` method is the more user-friendly option. We can search by common attributes such as `id`, `tag_name` and `class_name`, as well as by custom attributes (e.g. `data-testid`, passed as the keyword argument `data_testid`).

If that's not enough, we can use the `query` method to search for elements using CSS selectors, XPath queries, etc. Pydoll automatically takes care of identifying what type of query we're using.
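One plausible way to tell the two query types apart is to look at how the expression starts, since a CSS selector cannot begin with `/` or `(`. The function below is only a sketch of that heuristic; Pydoll's actual detection rules are not described here and may differ.

```python
def guess_selector_type(expression: str) -> str:
    """Classify a query string as 'xpath' or 'css' from its leading characters.

    Hypothetical heuristic: XPath expressions usually start with '/', './',
    '../' or '(' while CSS selectors never do.
    """
    stripped = expression.strip()
    if stripped.startswith(('/', './', '../', '(')):
        return 'xpath'
    return 'css'


for query in ('nav.main-menu', '//div[@data-testid="item-123"]', '(//a)[1]', '.card > a'):
    print(f'{query!r}: {guess_selector_type(query)}')
```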


### Browser-context HTTP requests - game changer for hybrid automation!
Ever wished you could make HTTP requests that automatically inherit all your browser's session state? **Now you can!**<br>
The `tab.request` property gives you a beautiful `requests`-like interface that executes HTTP calls directly in the browser's JavaScript context. This means every request automatically gets cookies, authentication headers, CORS policies, and session state, just as if the browser made the request itself.

**Perfect for Hybrid Automation:**
```python
# Navigate to a site and log in normally with Pydoll
await tab.go_to('https://example.com/login')
await (await tab.find(id='username')).type_text('user@example.com')
await (await tab.find(id='password')).type_text('password')
await (await tab.find(id='login-btn')).click()

# Now make API calls that inherit the logged-in session!
response = await tab.request.get('https://example.com/api/user/profile')
user_data = response.json()

# POST data while staying authenticated
response = await tab.request.post(
    'https://example.com/api/settings', 
    json={'theme': 'dark', 'notifications': True}
)

# Access response content in different formats
raw_data = response.content
text_data = response.text
json_data = response.json()

# Check cookies that were set
for cookie in response.cookies:
    print(f"Cookie: {cookie['name']} = {cookie['value']}")

# Add custom headers to your requests
headers = [
    {'name': 'X-Custom-Header', 'value': 'my-value'},
    {'name': 'X-API-Version', 'value': '2.0'}
]

await tab.request.get('https://api.example.com/data', headers=headers)

```

**Why this is great:**
- **No more session juggling** - Requests inherit browser cookies automatically
- **CORS just works** - Requests respect browser security policies  
- **Perfect for modern SPAs** - Seamlessly mix UI automation with API calls
- **Authentication made easy** - Log in once via UI, then hammer APIs
- **Hybrid workflows** - Use the best tool for each step (UI or API)

This opens up incredible possibilities for automation scenarios where you need both browser interaction AND API efficiency!
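The `headers` argument in the example above is a list of `{'name': ..., 'value': ...}` entries rather than the plain dictionary that `requests` users may be used to. If your headers already live in a dictionary, a small adapter produces the list form; the helper name below is hypothetical and not part of Pydoll.

```python
def to_header_list(headers: dict[str, str]) -> list[dict[str, str]]:
    """Convert {'Header-Name': 'value'} into the name/value entry list shown above."""
    return [{'name': name, 'value': value} for name, value in headers.items()]


header_list = to_header_list({
    'X-Custom-Header': 'my-value',
    'X-API-Version': '2.0',
})
print(header_list)
```

The result can be passed as `headers=header_list` to `tab.request.get(...)`.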

### New expect_download() context manager — robust file downloads made easy!
Tired of fighting with flaky download flows, missing files, or racy event listeners? Meet `tab.expect_download()`, a delightful, reliable way to handle file downloads.

- Automatically sets the browser’s download behavior
- Works with your own directory or a temporary folder (auto-cleaned!)
- Waits for completion with a timeout (so your tests don’t hang)
- Gives you a handy handle to read bytes/base64 or check `file_path`

Tiny example that just works:

```python
import asyncio
from pathlib import Path
from pydoll.browser import Chrome

async def download_report():
    async with Chrome() as browser:
        tab = await browser.start()
        await tab.go_to('https://example.com/reports')

        target_dir = Path('/tmp/my-downloads')
        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as download:
            # Trigger the download in the page (button/link/etc.)
            await (await tab.find(text='Download latest report')).click()
            # Wait until finished and read the content
            data = await download.read_bytes()
            print(f"Downloaded {len(data)} bytes to: {download.file_path}")

asyncio.run(download_report())
```

Want zero-hassle cleanup? Omit `keep_file_at` and we’ll create a temp folder and remove it automatically after the context exits. Perfect for tests.
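The keep-or-clean-up behavior described above follows a common context-manager pattern. The sketch below models only the directory lifecycle with the standard library; it is synchronous and does not touch a browser, and `download_dir` is a hypothetical name, not a Pydoll API.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def download_dir(keep_file_at: Optional[Path] = None) -> Iterator[Path]:
    """Yield a directory for downloads; remove it afterwards unless one was supplied."""
    if keep_file_at is not None:
        # Caller-owned directory: create it if needed and leave it in place.
        keep_file_at.mkdir(parents=True, exist_ok=True)
        yield keep_file_at
        return
    temp = Path(tempfile.mkdtemp(prefix='downloads-'))
    try:
        yield temp
    finally:
        # Temporary directory: always cleaned up, even if the body raised.
        shutil.rmtree(temp, ignore_errors=True)


with download_dir() as folder:
    report = folder / 'report.txt'
    report.write_text('demo')
    existed_inside = report.exists()

print(existed_inside, folder.exists())
```

Because the temporary folder disappears when the block exits, anything you need from the file has to be read inside the `with` body, which is why the download example reads the bytes before leaving the context.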

### Total browser control with custom preferences! (thanks to [@LucasAlvws](https://github.com/LucasAlvws))
Want to completely customize how Chrome behaves? **Now you can control EVERYTHING!**<br>
The new `browser_preferences` system gives you access to hundreds of internal Chrome settings that were previously impossible to change programmatically. We're talking about deep browser customization that goes way beyond command-line flags!

**The possibilities are endless:**
```python
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()

# Create the perfect automation environment
options.browser_preferences = {
    'download': {
        'default_directory': '/tmp/downloads',
        'prompt_for_download': False,
        'directory_upgrade': True,
        'extensions_to_open': ''  # Don't auto-open any downloads
    },
    'profile': {
        'default_content_setting_values': {
            'notifications': 2,        # Block all notifications
            'geolocation': 2,         # Block location requests
            'media_stream_camera': 2, # Block camera access
            'media_stream_mic': 2,    # Block microphone access
            'popups': 1               # Allow popups (useful for automation)
        },
        'password_manager_enabled': False,  # Disable password prompts
        'exit_type': 'Normal'              # Always exit cleanly
    },
    'intl': {
        'accept_languages': 'en-US,en',
        'charset_default': 'UTF-8'
    },
    'browser': {
        'check_default_browser': False,    # Don't ask about default browser
        'show_update_promotion_infobar': False
    }
}

# Or use the convenient helper methods
options.set_default_download_directory('/tmp/downloads')
options.set_accept_languages('en-US,en,pt-BR')  
options.prompt_for_download = False
```

**Real-world power examples:**
- **Silent downloads** - No prompts, no dialogs, just automated downloads
- **Block ALL distractions** - Notifications, popups, camera requests, you name it
- **Perfect for CI/CD** - Disable update checks, default browser prompts, crash reporting
- **Multi-region testing** - Change languages, timezones, and locale settings instantly
- **Security hardening** - Lock down permissions and disable unnecessary features
- **Advanced fingerprinting control** - Modify browser install dates, engagement history, and behavioral patterns

**Fingerprint customization for stealth automation:**
```python
import time

# Simulate a profile whose last user engagement was a week ago
fake_engagement_time = int(time.time()) - (7 * 24 * 60 * 60)  # 7 days ago

options.browser_preferences = {
    'settings': {
        'touchpad': {
            'natural_scroll': True,
        }
    },
    'profile': {
        'last_engagement_time': fake_engagement_time,
        'exit_type': 'Normal',
        'exited_cleanly': True
    },
    'newtab_page_location_override': 'https://www.google.com',
    'session': {
        'restore_on_startup': 1,  # Restore last session
        'startup_urls': ['https://www.google.com']
    }
}
```

This level of control was previously only available to Chrome extension developers - now it's in your automation toolkit!

Check the [documentation](https://pydoll.tech/docs/features/#custom-browser-preferences/) for more details.
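Because preferences are nested dictionaries, it can help to keep a shared baseline and layer per-run overrides on top before assigning the result to `options.browser_preferences`. The recursive merge below is a minimal sketch of that idea; `merge_preferences` is a hypothetical helper, and how Pydoll itself treats repeated assignments is not covered here.

```python
from typing import Any


def merge_preferences(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where `override` wins, merging nested dicts key by key."""
    merged = dict(base)  # copy the top level so neither input is mutated
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_preferences(merged[key], value)
        else:
            merged[key] = value
    return merged


baseline = {
    'download': {'default_directory': '/tmp/downloads', 'prompt_for_download': False},
    'intl': {'accept_languages': 'en-US,en'},
}
per_run = {
    'download': {'default_directory': '/tmp/run-42'},
    'profile': {'password_manager_enabled': False},
}

preferences = merge_preferences(baseline, per_run)
print(preferences)
```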

### Concurrent Automation

One of the great advantages of Pydoll is the ability to process multiple tasks simultaneously thanks to its asynchronous implementation. We can automate multiple tabs at the same time! Let's see an example:

```python
import asyncio
from pydoll.browser import Chrome

async def scrape_page(url, tab):
    await tab.go_to(url)
    title = await tab.execute_script('return document.title')
    links = await tab.find(tag_name='a', find_all=True)
    return {
        'url': url,
        'title': title,
        'link_count': len(links)
    }

async def concurrent_scraping():
    # The context manager closes the browser even if a task raises
    async with Chrome() as browser:
        tab_google = await browser.start()
        tab_duckduckgo = await browser.new_tab()
        tasks = [
            scrape_page('https://google.com/', tab_google),
            scrape_page('https://duckduckgo.com/', tab_duckduckgo)
        ]
        results = await asyncio.gather(*tasks)
        print(results)

asyncio.run(concurrent_scraping())
```

We managed to extract data from two pages at the same time!
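To see why this saves time, here is a browser-free sketch in which `asyncio.sleep` stands in for navigation and DOM queries. The function names and delays are made up for illustration. `asyncio.gather` runs the coroutines concurrently and returns results in the order the coroutines were passed, regardless of which finishes first.

```python
import asyncio
import time


async def fake_scrape(url: str, delay: float) -> dict:
    await asyncio.sleep(delay)  # stand-in for page load and element lookups
    return {'url': url, 'delay': delay}


async def run_all() -> tuple[list[dict], float]:
    started = time.perf_counter()
    results = await asyncio.gather(
        fake_scrape('https://google.com/', 0.2),
        fake_scrape('https://duckduckgo.com/', 0.2),
    )
    return results, time.perf_counter() - started


results, elapsed = asyncio.run(run_all())
# Total time is close to the slowest task, not the sum of both delays.
print([item['url'] for item in results], round(elapsed, 1))
```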

And there's much, much more: an event system for reactive automations, request interception and modification, and so on. Take a look at the documentation; you won't regret it!


## 🔧 Quick Troubleshooting

**Browser not found?**
```python
from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()
options.binary_location = '/path/to/your/chrome'
browser = Chrome(options=options)
```

**Getting a `FailedToStartBrowser` error even though the browser eventually starts?**
```python
from pydoll.browser import Chrome
from pydoll.browser.options import ChromiumOptions

options = ChromiumOptions()
options.start_timeout = 20  # default is 10 seconds

browser = Chrome(options=options)
```

**Need a proxy?**
```python
options.add_argument('--proxy-server=your-proxy:port')
```

**Running in Docker?**
```python
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
```

## 📚 Documentation

For complete documentation, detailed examples and deep dives into all Pydoll functionalities, visit our [official documentation](https://pydoll.tech/).

The documentation includes:
- **Getting Started Guide** - Step-by-step tutorials
- **API Reference** - Complete method documentation
- **Advanced Techniques** - Network interception, event handling, performance optimization

>The Chinese version of this README is [here](README_zh.md).

## 🤝 Contributing

We would love your help to make Pydoll even better! Check out our [contribution guidelines](CONTRIBUTING.md) to get started. Whether it's fixing bugs, adding features or improving documentation - all contributions are welcome!

Please make sure to:
- Write tests for new features or bug fixes
- Follow code style and conventions
- Use conventional commits for pull requests
- Run lint checks and tests before submitting

## 💖 Support My Work

If you find Pydoll useful, consider [supporting me on GitHub](https://github.com/sponsors/thalissonvs).  
You'll get access to exclusive benefits like priority support, custom features and much more!

Can't sponsor right now? No problem, you can still help a lot by:
- Starring the repository
- Sharing on social media
- Writing posts or tutorials
- Giving feedback or reporting issues

Every bit of support makes a difference.

## 💬 Spread the word

If Pydoll saved you time, mental health, or a keyboard from being smashed, give it a ⭐, share it, or tell your weird dev friends.

## 📄 License

Pydoll is licensed under the [MIT License](LICENSE).

<p align="center">
  <b>Pydoll</b> — Making browser automation magical!
</p>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "pydoll-python",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "Thalison Fernandes",
    "author_email": "thalissfernandes99@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/42/5d/b43a5eaa0a80aa7a74b623e365ab9d1882925aa8fbe0d7043b5feeeec935/pydoll_python-2.8.0.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n    <img src=\"https://github.com/user-attachments/assets/219f2dbc-37ed-4aea-a289-ba39cdbb335d\" alt=\"Pydoll Logo\" /> <br>\n</p>\n<h1 align=\"center\">Pydoll: Automate the Web, Naturally</h1>\n\n<p align=\"center\">\n    <a href=\"https://github.com/autoscrape-labs/pydoll/stargazers\"><img src=\"https://img.shields.io/github/stars/autoscrape-labs/pydoll?style=social\"></a>\n    <a href=\"https://codecov.io/gh/autoscrape-labs/pydoll\" >\n        <img src=\"https://codecov.io/gh/autoscrape-labs/pydoll/graph/badge.svg?token=40I938OGM9\"/>\n    </a>\n    <img src=\"https://github.com/autoscrape-labs/pydoll/actions/workflows/tests.yml/badge.svg\" alt=\"Tests\">\n    <img src=\"https://github.com/autoscrape-labs/pydoll/actions/workflows/ruff-ci.yml/badge.svg\" alt=\"Ruff CI\">\n    <img src=\"https://github.com/autoscrape-labs/pydoll/actions/workflows/mypy.yml/badge.svg\" alt=\"MyPy CI\">\n    <img src=\"https://img.shields.io/badge/python-%3E%3D3.10-blue\" alt=\"Python >= 3.10\">\n    <a href=\"https://deepwiki.com/autoscrape-labs/pydoll\"><img src=\"https://deepwiki.com/badge.svg\" alt=\"Ask DeepWiki\"></a>\n</p>\n\n\n<p align=\"center\">\n  \ud83d\udcd6 <a href=\"https://pydoll.tech/\">Documentation</a> \u2022\n  \ud83d\ude80 <a href=\"#-getting-started\">Getting Started</a> \u2022\n  \u26a1 <a href=\"#-advanced-features\">Advanced Features</a> \u2022\n  \ud83e\udd1d <a href=\"#-contributing\">Contributing</a> \u2022\n  \ud83d\udc96 <a href=\"#-support-my-work\">Support My Work</a>\n</p>\n\nImagine the following scenario: you need to automate tasks in your browser. Maybe it's testing a web application, collecting data from a site, or even automating repetitive processes. 
Normally this involves using external drivers, complex configurations, and many compatibility issues.\n\n**Pydoll was born to solve these problems.**\n\nBuilt from scratch with a different philosophy, Pydoll connects directly to the Chrome DevTools Protocol (CDP), eliminating the need for external drivers. This clean implementation along with realistic ways of clicking, navigating and interacting with elements makes it practically indistinguishable from a real user.\n\nWe believe that powerful automation shouldn't require you to become an expert in configuration or constantly fight with bot protection systems. With Pydoll, you can focus on what really matters: your automation logic, not the underlying complexity or protection systems.\n\n<div>\n  <h4>Be a good human. Give it a star \u2b50</h4> \n    No stars, no bugs fixed. Just kidding (maybe)\n</div>\n\n## \ud83c\udf1f What makes Pydoll special?\n\n- **Zero Webdrivers**: Say goodbye to webdriver compatibility issues\n- **Human-like Interaction Engine**: Capable of passing behavioral CAPTCHAs like reCAPTCHA v3 or Turnstile, depending on IP reputation and interaction patterns\n- **Asynchronous Performance**: For high-speed automation and multiple simultaneous tasks\n- **Humanized Interactions**: Mimic real user behavior\n- **Simplicity**: With Pydoll, you install and you're ready to automate.\n\n## What's New\n\n### Remote connections via WebSocket \u2014 control any Chrome from anywhere!\n\nYou asked for it, we delivered. 
You can now connect to an already running browser remotely via its WebSocket address and use the full Pydoll API immediately.\n\n```python\nfrom pydoll.browser.chromium import Chrome\n\nchrome = Chrome()\ntab = await chrome.connect('ws://YOUR_HOST:9222/devtools/browser/XXXX')\n\n# Full power unlocked: navigation, element automation, requests, events\u2026\nawait tab.go_to('https://example.com')\ntitle = await tab.execute_script('return document.title')\nprint(title)\n```\n\nThis makes it effortless to run Pydoll against remote/CI browsers, containers, or shared debugging targets \u2014 no local launch required. Just point to the WS endpoint and automate.\n\n### Navigate the DOM like a pro: get_children_elements() and get_siblings_elements()\n\nTwo delightful helpers to traverse complex layouts with intention:\n\n```python\n# Grab direct children of a container\ncontainer = await tab.find(id='cards')\ncards = await container.get_children_elements(max_depth=1)\n\n# Want to go deeper? This will return children of children (and so on)\nelements = await container.get_children_elements(max_depth=2) \n\n# Walk horizontal lists without re-querying the DOM\nactive = await tab.find(class_name='item-active')\nsiblings = await active.get_siblings_elements()\n\nprint(len(cards), len(siblings))\n```\n\nUse them to cut boilerplate, express intent, and keep your scraping/automation logic clean and readable \u2014 especially in dynamic grids, lists and menus.\n\n### WebElement: state waiting and new public APIs\n\n- New `wait_until(...)` on `WebElement` to await element states with minimal code:\n\n```python\n# Wait until it becomes visible OR the timeout expires\nawait element.wait_until(is_visible=True, timeout=5)\n\n# Wait until it becomes interactable (visible, on top, receiving pointer events)\nawait element.wait_until(is_interactable=True, timeout=10)\n```\n\n- Methods now public on `WebElement`:\n  - `is_visible()`\n    - Checks that the element has a visible area (> 0), 
isn\u2019t hidden by CSS and is in the viewport (after `scroll_into_view()` when needed). Useful pre-check before interactions.\n  - `is_interactable()`\n    - \u201cClick-ready\u201d state: combines visibility, enabledness and pointer-event hit testing. Ideal for robust flows that avoid lost clicks.\n  - `is_on_top()`\n    - Verifies the element is the top hit-test target at the intended click point, avoiding overlays.\n  - `execute_script(script: str, return_by_value: bool = False)`\n    - Executes JavaScript in the element\u2019s own context (where `this` is the element). Great for fine-tuning and quick inspections.\n\n```python\n# Visually outline the element via JS\nawait element.execute_script(\"this.style.outline='2px solid #22d3ee'\")\n\n# Confirm states\nvisible = await element.is_visible()\ninteractable = await element.is_interactable()\non_top = await element.is_on_top()\n```\n\nThese additions simplify waiting and state validation before clicking/typing, reducing flakiness and making automations more predictable.\n\n\n## \ud83d\udce6 Installation\n\n```bash\npip install pydoll-python\n```\n\nAnd that's it! Just install and start automating.\n\n## \ud83d\ude80 Getting Started\n\n### Your first automation\n\nLet's start with a real example: an automation that performs a Google search and clicks on the first result. 
With this example, we can see how the library works and how you can start automating your tasks.\n\n```python\nimport asyncio\n\nfrom pydoll.browser import Chrome\nfrom pydoll.constants import Key\n\nasync def google_search(query: str):\n    async with Chrome() as browser:\n        tab = await browser.start()\n        await tab.go_to('https://www.google.com')\n        search_box = await tab.find(tag_name='textarea', name='q')\n        await search_box.insert_text(query)\n        await search_box.press_keyboard_key(Key.ENTER)\n        await (await tab.find(\n            tag_name='h3',\n            text='autoscrape-labs/pydoll',\n            timeout=10,\n        )).click()\n        await tab.find(id='repository-container-header', timeout=10)\n\nasyncio.run(google_search('pydoll python'))\n```\n\nWithout configurations, just a simple script, we can do a complete Google search!\nOkay, now let's see how we can extract data from a page, using the same previous example.\nLet's consider in the code below that we're already on the Pydoll page. We want to extract the following information:\n\n- Project description\n- Number of stars\n- Number of forks\n- Number of issues\n- Number of pull requests\n\nLet's get started! To get the project description, we'll use xpath queries. You can check the documentation on how to build your own queries.\n\n```python\ndescription = await (await tab.query(\n    '//h2[contains(text(), \"About\")]/following-sibling::p',\n    timeout=10,\n)).text\n```\n\nAnd that's it! Let's understand what this query does:\n\n1. `//h2[contains(text(), \"About\")]` - Selects the first `<h2>` that contains \"About\"\n2. 
`/following-sibling::p` - Selects the first `<p>` that comes after the `<h2>`\n\nNow let's get the rest of the data:\n\n```python\nnumber_of_stars = await (await tab.find(\n    id='repo-stars-counter-star'\n)).text\n\nnumber_of_forks = await (await tab.find(\n    id='repo-network-counter'\n)).text\nnumber_of_issues = await (await tab.find(\n    id='issues-repo-tab-count',\n)).text\nnumber_of_pull_requests = await (await tab.find(\n    id='pull-requests-repo-tab-count',\n)).text\n\ndata = {\n    'description': description,\n    'number_of_stars': number_of_stars,\n    'number_of_forks': number_of_forks,\n    'number_of_issues': number_of_issues,\n    'number_of_pull_requests': number_of_pull_requests,\n}\nprint(data)\n\n```\n\nWe managed to extract all the necessary data!\n\n### Custom Configurations\n\nSometimes we need more control over the browser. Pydoll offers a flexible way to do this. Let's see the example below:\n\n\n```python\nfrom pydoll.browser import Chrome\nfrom pydoll.browser.options import ChromiumOptions as Options\n\nasync def custom_automation():\n    # Configure browser options\n    options = Options()\n    options.add_argument('--proxy-server=username:password@ip:port')\n    options.add_argument('--window-size=1920,1080')\n    options.binary_location = '/path/to/your/browser'\n    options.start_timeout = 20\n\n    async with Chrome(options=options) as browser:\n        tab = await browser.start()\n        # Your automation code here\n        await tab.go_to('https://example.com')\n        # The browser is now using your custom settings\n\nasyncio.run(custom_automation())\n```\n\nIn this example, we're configuring the browser to use a proxy and a 1920x1080 window, in addition to a custom path for the Chrome binary, in case your installation location is different from the common defaults.\n\n\n## \u26a1 Advanced Features\n\nPydoll offers a series of advanced features to please even the most\ndemanding users.\n\n\n\n### Advanced Element Search\n\nWe 
have several ways to find elements on the page. Whichever style you prefer, there is one that fits:\n\n```python\nimport asyncio\nfrom pydoll.browser import Chrome\n\nasync def element_finding_examples():\n    async with Chrome() as browser:\n        tab = await browser.start()\n        await tab.go_to('https://example.com')\n\n        # Find by attributes (most intuitive)\n        submit_btn = await tab.find(\n            tag_name='button',\n            class_name='btn-primary',\n            text='Submit'\n        )\n        # Find by ID\n        username_field = await tab.find(id='username')\n        # Find multiple elements\n        all_links = await tab.find(tag_name='a', find_all=True)\n        # CSS selectors and XPath\n        nav_menu = await tab.query('nav.main-menu')\n        specific_item = await tab.query('//div[@data-testid=\"item-123\"]')\n        # With timeout and error handling\n        delayed_element = await tab.find(\n            class_name='dynamic-content',\n            timeout=10,\n            raise_exc=False  # Returns None if not found\n        )\n        # Advanced: Custom attributes\n        custom_element = await tab.find(\n            data_testid='submit-button',\n            aria_label='Submit form'\n        )\n\nasyncio.run(element_finding_examples())\n```\n\nThe `find` method is the more user-friendly option. We can search by common attributes like id, tag_name, class_name, etc., as well as by custom attributes (e.g. `data-testid`).\n\nIf that's not enough, we can use the `query` method to search for elements using CSS selectors, XPath queries, etc. Pydoll automatically takes care of identifying what type of query we're using.\n\n\n### Browser-context HTTP requests - game changer for hybrid automation!\nEver wished you could make HTTP requests that automatically inherit all your browser's session state? 
**Now you can!**<br>\nThe `tab.request` property gives you a beautiful `requests`-like interface that executes HTTP calls directly in the browser's JavaScript context. This means every request automatically gets cookies, authentication headers, CORS policies, and session state, just as if the browser made the request itself.\n\n**Perfect for Hybrid Automation:**\n```python\n# Navigate to a site and login normally with PyDoll\nawait tab.go_to('https://example.com/login')\nawait (await tab.find(id='username')).type_text('user@example.com')\nawait (await tab.find(id='password')).type_text('password')\nawait (await tab.find(id='login-btn')).click()\n\n# Now make API calls that inherit the logged-in session!\nresponse = await tab.request.get('https://example.com/api/user/profile')\nuser_data = response.json()\n\n# POST data while staying authenticated\nresponse = await tab.request.post(\n    'https://example.com/api/settings', \n    json={'theme': 'dark', 'notifications': True}\n)\n\n# Access response content in different formats\nraw_data = response.content\ntext_data = response.text\njson_data = response.json()\n\n# Check cookies that were set\nfor cookie in response.cookies:\n    print(f\"Cookie: {cookie['name']} = {cookie['value']}\")\n\n# Add custom headers to your requests\nheaders = [\n    {'name': 'X-Custom-Header', 'value': 'my-value'},\n    {'name': 'X-API-Version', 'value': '2.0'}\n]\n\nawait tab.request.get('https://api.example.com/data', headers=headers)\n\n```\n\n**Why this is great:**\n- **No more session juggling** - Requests inherit browser cookies automatically\n- **CORS just works** - Requests respect browser security policies  \n- **Perfect for modern SPAs** - Seamlessly mix UI automation with API calls\n- **Authentication made easy** - Login once via UI, then hammer APIs\n- **Hybrid workflows** - Use the best tool for each step (UI or API)\n\nThis opens up incredible possibilities for automation scenarios where you need both browser interaction AND 
API efficiency!\n\n### New expect_download() context manager \u2014 robust file downloads made easy!\nTired of fighting with flaky download flows, missing files, or racy event listeners? Meet `tab.expect_download()`, a delightful, reliable way to handle file downloads.\n\n- Automatically sets the browser\u2019s download behavior\n- Works with your own directory or a temporary folder (auto-cleaned!)\n- Waits for completion with a timeout (so your tests don\u2019t hang)\n- Gives you a handy handle to read bytes/base64 or check `file_path`\n\nTiny example that just works:\n\n```python\nimport asyncio\nfrom pathlib import Path\nfrom pydoll.browser import Chrome\n\nasync def download_report():\n    async with Chrome() as browser:\n        tab = await browser.start()\n        await tab.go_to('https://example.com/reports')\n\n        target_dir = Path('/tmp/my-downloads')\n        async with tab.expect_download(keep_file_at=target_dir, timeout=10) as download:\n            # Trigger the download in the page (button/link/etc.)\n            await (await tab.find(text='Download latest report')).click()\n            # Wait until finished and read the content\n            data = await download.read_bytes()\n            print(f\"Downloaded {len(data)} bytes to: {download.file_path}\")\n\nasyncio.run(download_report())\n```\n\nWant zero-hassle cleanup? Omit `keep_file_at` and we\u2019ll create a temp folder and remove it automatically after the context exits. Perfect for tests.\n\n### Total browser control with custom preferences! (thanks to [@LucasAlvws](https://github.com/LucasAlvws))\nWant to completely customize how Chrome behaves? **Now you can control EVERYTHING!**<br>\nThe new `browser_preferences` system gives you access to hundreds of internal Chrome settings that were previously impossible to change programmatically. 
We're talking about deep browser customization that goes way beyond command-line flags!\n\n**The possibilities are endless:**\n```python\noptions = ChromiumOptions()\n\n# Create the perfect automation environment\noptions.browser_preferences = {\n    'download': {\n        'default_directory': '/tmp/downloads',\n        'prompt_for_download': False,\n        'directory_upgrade': True,\n        'extensions_to_open': ''  # Don't auto-open any downloads\n    },\n    'profile': {\n        'default_content_setting_values': {\n            'notifications': 2,        # Block all notifications\n            'geolocation': 2,         # Block location requests\n            'media_stream_camera': 2, # Block camera access\n            'media_stream_mic': 2,    # Block microphone access\n            'popups': 1               # Allow popups (useful for automation)\n        },\n        'password_manager_enabled': False,  # Disable password prompts\n        'exit_type': 'Normal'              # Always exit cleanly\n    },\n    'intl': {\n        'accept_languages': 'en-US,en',\n        'charset_default': 'UTF-8'\n    },\n    'browser': {\n        'check_default_browser': False,    # Don't ask about default browser\n        'show_update_promotion_infobar': False\n    }\n}\n\n# Or use the convenient helper methods\noptions.set_default_download_directory('/tmp/downloads')\noptions.set_accept_languages('en-US,en,pt-BR')  \noptions.prompt_for_download = False\n```\n\n**Real-world power examples:**\n- **Silent downloads** - No prompts, no dialogs, just automated downloads\n- **Block ALL distractions** - Notifications, popups, camera requests, you name it\n- **Perfect for CI/CD** - Disable update checks, default browser prompts, crash reporting\n- **Multi-region testing** - Change languages, timezones, and locale settings instantly\n- **Security hardening** - Lock down permissions and disable unnecessary features\n- **Advanced fingerprinting control** - Modify browser install dates, 
engagement history, and behavioral patterns\n\n**Fingerprint customization for stealth automation:**\n```python\nimport time\n\n# Simulate a browser that was last actively used a week ago\nfake_engagement_time = int(time.time()) - (7 * 24 * 60 * 60)  # 7 days ago\n\noptions.browser_preferences = {\n    'settings': {\n        'touchpad': {\n            'natural_scroll': True,\n        }\n    },\n    'profile': {\n        'last_engagement_time': fake_engagement_time,\n        'exit_type': 'Normal',\n        'exited_cleanly': True\n    },\n    'newtab_page_location_override': 'https://www.google.com',\n    'session': {\n        'restore_on_startup': 1,  # Restore last session\n        'startup_urls': ['https://www.google.com']\n    }\n}\n```\n\nThis level of control was previously only available to Chrome extension developers - now it's in your automation toolkit!\n\nCheck the [documentation](https://pydoll.tech/docs/features/#custom-browser-preferences/) for more details.\n\n### Concurrent Automation\n\nOne of the great advantages of Pydoll is the ability to process multiple tasks simultaneously thanks to its asynchronous implementation. We can automate multiple tabs\nat the same time! 
Let's see an example:\n\n```python\nimport asyncio\nfrom pydoll.browser import Chrome\n\nasync def scrape_page(url, tab):\n    await tab.go_to(url)\n    title = await tab.execute_script('return document.title')\n    links = await tab.find(tag_name='a', find_all=True)\n    return {\n        'url': url,\n        'title': title,\n        'link_count': len(links)\n    }\n\nasync def concurrent_scraping():\n    browser = Chrome()\n    tab_google = await browser.start()\n    tab_duckduckgo = await browser.new_tab()\n    tasks = [\n        scrape_page('https://google.com/', tab_google),\n        scrape_page('https://duckduckgo.com/', tab_duckduckgo)\n    ]\n    results = await asyncio.gather(*tasks)\n    print(results)\n    await browser.stop()\n\nasyncio.run(concurrent_scraping())\n```\n\nWe managed to extract data from two pages at the same time!\n\nAnd there's much, much more! Event system for reactive automations, request interception and modification, and so on. Take a look at the documentation, you won't\nregret it!\n\n\n## \ud83d\udd27 Quick Troubleshooting\n\n**Browser not found?**\n```python\nfrom pydoll.browser import Chrome\nfrom pydoll.browser.options import ChromiumOptions\n\noptions = ChromiumOptions()\noptions.binary_location = '/path/to/your/chrome'\nbrowser = Chrome(options=options)\n```\n\n**Browser starts after a FailedToStartBrowser error?**\n```python\nfrom pydoll.browser import Chrome\nfrom pydoll.browser.options import ChromiumOptions\n\noptions = ChromiumOptions()\noptions.start_timeout = 20  # default is 10 seconds\n\nbrowser = Chrome(options=options)\n```\n\n**Need a proxy?**\n```python\noptions.add_argument('--proxy-server=your-proxy:port')\n```\n\n**Running in Docker?**\n```python\noptions.add_argument('--no-sandbox')\noptions.add_argument('--disable-dev-shm-usage')\n```\n\n## \ud83d\udcda Documentation\n\nFor complete documentation, detailed examples and deep dives into all Pydoll functionalities, visit our [official 
documentation](https://pydoll.tech/).\n\nThe documentation includes:\n- **Getting Started Guide** - Step-by-step tutorials\n- **API Reference** - Complete method documentation\n- **Advanced Techniques** - Network interception, event handling, performance optimization\n\n>The Chinese version of this README is [here](README_zh.md).\n\n## \ud83e\udd1d Contributing\n\nWe would love your help to make Pydoll even better! Check out our [contribution guidelines](CONTRIBUTING.md) to get started. Whether it's fixing bugs, adding features or improving documentation - all contributions are welcome!\n\nPlease make sure to:\n- Write tests for new features or bug fixes\n- Follow code style and conventions\n- Use conventional commits for pull requests\n- Run lint checks and tests before submitting\n\n## \ud83d\udc96 Support My Work\n\nIf you find Pydoll useful, consider [supporting me on GitHub](https://github.com/sponsors/thalissonvs).  \nYou'll get access to exclusive benefits like priority support, custom features and much more!\n\nCan't sponsor right now? No problem, you can still help a lot by:\n- Starring the repository\n- Sharing on social media\n- Writing posts or tutorials\n- Giving feedback or reporting issues\n\nEvery bit of support makes a difference!\n\n## \ud83d\udcac Spread the word\n\nIf Pydoll saved you time, mental health, or a keyboard from being smashed, give it a \u2b50, share it, or tell your weird dev friends.\n\n## \ud83d\udcc4 License\n\nPydoll is licensed under the [MIT License](LICENSE).\n\n<p align=\"center\">\n  <b>Pydoll</b> \u2014 Making browser automation magical!\n</p>\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.",
    "version": "2.8.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "48457be5605fb2b9cecfb905a0ed6c042168b7fdfb53cb4e997d88fb94cf24dc",
                "md5": "6e32fcff0766a6a323dadc4e38b59d26",
                "sha256": "6c5fe22c3b61af74695c466f5adb361a604f10f34a483d2832b196b1b00d9e1c"
            },
            "downloads": -1,
            "filename": "pydoll_python-2.8.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6e32fcff0766a6a323dadc4e38b59d26",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.10",
            "size": 194644,
            "upload_time": "2025-08-28T04:11:21",
            "upload_time_iso_8601": "2025-08-28T04:11:21.671513Z",
            "url": "https://files.pythonhosted.org/packages/48/45/7be5605fb2b9cecfb905a0ed6c042168b7fdfb53cb4e997d88fb94cf24dc/pydoll_python-2.8.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "425db43a5eaa0a80aa7a74b623e365ab9d1882925aa8fbe0d7043b5feeeec935",
                "md5": "57e2ed0ace6443fb9a2b92852c67e449",
                "sha256": "92069b160d33d5e06300e5452b468d7c5f84c4cf4150ef00bd6c6ccebd1621f6"
            },
            "downloads": -1,
            "filename": "pydoll_python-2.8.0.tar.gz",
            "has_sig": false,
            "md5_digest": "57e2ed0ace6443fb9a2b92852c67e449",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.10",
            "size": 169774,
            "upload_time": "2025-08-28T04:11:22",
            "upload_time_iso_8601": "2025-08-28T04:11:22.814452Z",
            "url": "https://files.pythonhosted.org/packages/42/5d/b43a5eaa0a80aa7a74b623e365ab9d1882925aa8fbe0d7043b5feeeec935/pydoll_python-2.8.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-28 04:11:22",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "pydoll-python"
}
        