undetected-geckodriver


| Field           | Value |
| :-------------- | :---- |
| Name            | undetected-geckodriver |
| Version         | 1.0.7 |
| Home page       | https://github.com/ByteXenon/undetected_geckodriver |
| Summary         | A Firefox Selenium WebDriver that patches the browser to avoid detection. Bypasses services such as Cloudflare, Distil Networks, and more. Ideal for web scraping, automated testing, and bot development without getting detected. |
| Upload time     | 2024-11-20 19:39:27 |
| Author          | ByteXenon |
| Requires Python | >=3.6 |
| License         | MIT |
| Keywords        | selenium firefox webdriver undetected bypass cloudflare distil web scraping automated testing bot development anti-detection automation browser automation |
| Requirements    | selenium, psutil |

<div align="center">

# Undetected GeckoDriver v1.0.7

A Python package that integrates with Firefox Selenium to bypass anti-bot detection mechanisms, ideal for web scraping, automated testing, and browser automation without being marked as a bot.

[![PyPI version](https://badge.fury.io/py/undetected-geckodriver.svg)](https://badge.fury.io/py/undetected-geckodriver)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://pepy.tech/badge/undetected-geckodriver)](https://pepy.tech/project/undetected-geckodriver)
[![Downloads](https://pepy.tech/badge/undetected-geckodriver/month)](https://pepy.tech/project/undetected-geckodriver)
[![Downloads](https://pepy.tech/badge/undetected-geckodriver/week)](https://pepy.tech/project/undetected-geckodriver)

</div>

## Preview

|                                           With undetected-geckodriver                                           |                                           Without undetected-geckodriver                                           |
| :-------------------------------------------------------------------------------------------------------------: | :----------------------------------------------------------------------------------------------------------------: |
| ![With undetected-geckodriver](https://github.com/user-attachments/assets/24a208c0-4793-4d5d-bf3c-22e3a1beb9a4) | ![Without undetected-geckodriver](https://github.com/user-attachments/assets/927be4df-06d6-4d88-8948-668c35efa68e) |

> You can test it yourself on [BrowserScan's bot detection page](https://www.browserscan.net/bot-detection).

## Overview

> [!NOTE]
> Currently, this package only supports Linux. Support for Windows and macOS is planned for future releases.

Undetected GeckoDriver is a Python package that works with the [Selenium](https://github.com/SeleniumHQ/selenium) browser automation framework. Selenium lets you control web browsers through code, which makes it a common tool for web scraping, automated testing, and browser automation. However, a browser controlled by a script (sometimes called a "puppet browser") typically sets specific properties that anti-bot services like Cloudflare can detect. For instance, a page can read `navigator.webdriver` with JavaScript, and a site protected by such a service may then restrict access to its content.
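
As a rough illustration of the check described above, the sketch below reads `navigator.webdriver` through Selenium's `execute_script` method. The helper name `reports_webdriver` is made up for this example, and real anti-bot services are likely to look at more signals than this single property.

```python
def reports_webdriver(driver):
    """Return True if page JavaScript sees navigator.webdriver as truthy.

    `driver` is any Selenium WebDriver instance (or any object exposing the
    same `execute_script` method). The helper name is hypothetical.
    """
    # execute_script runs the snippet in the current page and returns its value.
    return bool(driver.execute_script("return navigator.webdriver"))
```

With a stock `selenium.webdriver.Firefox()` instance this is expected to return `True`; the aim of the patched browser is for it to return `False`.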

To address this issue, Undetected GeckoDriver acts as an interface between your code and Selenium, helping you bypass bot detection mechanisms. When you create a new WebDriver instance with the `Firefox()` class from the Undetected GeckoDriver package (instead of using Selenium's class directly), two things happen:

1. The original Firefox binary is located, copied, and patched to prevent it from modifying properties such as `navigator.webdriver` while using Selenium.
2. A Selenium WebDriver instance is created that uses the patched Firefox binary.

As a result, you can scrape data, automate tasks, and perform other browser-based operations with a lower chance of triggering bot detection mechanisms.
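
This README does not describe the patch itself, so the following is only a generic sketch of the copy-then-patch idea from step 1, applied to an arbitrary file. It is not the package's actual implementation: the function name `copy_and_patch`, the byte patterns, and the same-length rule are all assumptions chosen for illustration.

```python
import shutil
from pathlib import Path


def copy_and_patch(original, destination, old, new):
    """Copy `original` to `destination`, then replace `old` with `new` in the copy.

    Hypothetical helper: the original file is never modified. The replacement
    must have the same length so that byte offsets inside the file stay valid.
    Returns the number of occurrences that were replaced.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the pattern")

    original, destination = Path(original), Path(destination)
    shutil.copy2(str(original), str(destination))  # work on a copy only

    data = destination.read_bytes()
    count = data.count(old)
    destination.write_bytes(data.replace(old, new))
    return count
```

Working on a copy keeps the system-wide browser installation untouched, which matches the "located, copied, and patched" order given above.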

## Installation

You can install the package via pip:

```bash
pip install undetected-geckodriver
```

Or you can install it from source:

```bash
git clone https://github.com/bytexenon/undetected_geckodriver
cd undetected_geckodriver
pip install .
```

> [!NOTE]
> Installing from source is recommended only if you plan to contribute to the project. For regular use, install the package from PyPI with pip.

## Usage

Since Undetected GeckoDriver acts as an interface for Selenium, you can use it the same way you would use Selenium.

You can integrate Undetected GeckoDriver into your existing Selenium code by simply replacing the `selenium.webdriver.Firefox` imports with `undetected_geckodriver.Firefox`.
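
Because the package currently supports only Linux, code that also has to run elsewhere may want to fall back to plain Selenium. The sketch below shows one possible way to do that; the helper names `firefox_module_name` and `load_firefox_class` are hypothetical, and falling back to an unpatched browser is an assumption about what you want on other platforms.

```python
import importlib
import sys


def firefox_module_name(platform=sys.platform):
    """Pick the module to import `Firefox` from (hypothetical helper)."""
    if platform.startswith("linux"):
        return "undetected_geckodriver"
    # Assumption: on unsupported platforms, use the regular Selenium class.
    return "selenium.webdriver"


def load_firefox_class(platform=sys.platform):
    """Import the chosen module and return its `Firefox` class."""
    module = importlib.import_module(firefox_module_name(platform))
    return module.Firefox
```

Calling `load_firefox_class()()` then creates a driver in the same way as `Firefox()` in the examples that follow.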

Here are a couple of examples demonstrating how you can use this project:

1. **Creating a new undetected WebDriver instance and navigating to example.com**:

   ```python
   from undetected_geckodriver import Firefox

   driver = Firefox()
   driver.get("https://www.example.com")
   ```

2. **Searching for "Undetected Geckodriver 1337!" on Google**:

   ```python
   import time

   from selenium.webdriver.common.by import By
   from undetected_geckodriver import Firefox

   # Constants
   SEARCH_FOR = "Undetected Geckodriver 1337!"
   GOOGLE_URL = "https://www.google.com"

   # Initialize the undetected Firefox browser
   driver = Firefox()

   try:
       # Navigate to Google
       driver.get(GOOGLE_URL)

       # Locate the search box and perform the search
       search_box = driver.find_element(By.NAME, "q")
       search_box.send_keys(SEARCH_FOR)
       search_box.submit()

       # Give the results page a moment to load
       time.sleep(2)

       # Print the current URL after the search
       print("Current URL:", driver.current_url)

       # Keep the window open for a while to observe the results
       time.sleep(15)
   finally:
       # Close the browser even if one of the steps above fails
       driver.quit()
   ```

Because Undetected GeckoDriver is built on top of Selenium, the [official Selenium documentation](https://www.selenium.dev/documentation/en/) covers further information and advanced usage.

## Requirements

- **`Firefox`**
- **`Python >= 3.6`**
- **`Selenium >= 4.10.0`**
- **`Psutil >= 5.8.0`**
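
A quick pre-flight check for these requirements could look like the sketch below. It is an illustration only: the helper names are hypothetical, the version parser is deliberately naive, and looking for an executable named `firefox` on `PATH` is an assumption that may not match how the package locates the browser.

```python
import shutil
import sys


def parse_version(text):
    """Turn '4.10.0' into (4, 10, 0); anything after the digits is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '4.10' compares equal to '4.10.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def environment_problems():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not sys.platform.startswith("linux"):
        problems.append("only Linux is currently supported")
    if sys.version_info < (3, 6):
        problems.append("Python 3.6 or newer is required")
    # Assumption: Firefox is installed as an executable named 'firefox'.
    if shutil.which("firefox") is None:
        problems.append("no 'firefox' executable found on PATH")
    return problems
```

The library minimums can then be checked with calls such as `meets_minimum(selenium.__version__, "4.10.0")` and `meets_minimum(psutil.__version__, "5.8.0")`.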

## FAQ

### The browser is still being detected as a bot. What should I do?

If your browser is still being detected as a bot while using Undetected GeckoDriver, it may be due to advanced bot detection mechanisms on the website. In such cases, please open an issue on the GitHub repository with the website URL and any relevant information. This will help in investigating and potentially adding support for it in future releases.
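
To make such a report easier to act on, it may help to attach a few facts about the session. The sketch below gathers them with standard Selenium calls; the helper name `build_detection_report` and the choice of fields are assumptions, not something the project asks for.

```python
def build_detection_report(driver, url):
    """Load `url` and collect session details for a bug report.

    Hypothetical helper; `driver` is a Selenium WebDriver instance.
    """
    driver.get(url)
    return {
        "url": url,
        # Capability keys reported by the browser session.
        "browser_name": driver.capabilities.get("browserName"),
        "browser_version": driver.capabilities.get("browserVersion"),
        # Values as seen by JavaScript running in the page.
        "user_agent": driver.execute_script("return navigator.userAgent"),
        "navigator_webdriver": driver.execute_script("return navigator.webdriver"),
    }
```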

### Why patch the Firefox binary?

When Firefox is controlled remotely by a script (such as when using Selenium), it sets certain properties defined in the WebDriver specification, such as `navigator.webdriver`, and anti-bot services can detect them. Selenium itself doesn't control these properties directly. By patching the Firefox binary, we can prevent it from modifying these properties, allowing us to interact with websites without being detected as a bot.

### Why use Undetected GeckoDriver over undetected-chromedriver?

While undetected-chromedriver is a great tool for bypassing bot detection mechanisms, it only supports Chrome and Edge browsers. Undetected GeckoDriver fills this gap by providing similar functionality for Firefox browsers.

## Roadmap

**Completed:**

- [x] **Spoof `navigator.webdriver` property**: Implement a method to spoof the `navigator.webdriver` property to prevent detection by services like Cloudflare. This helps in avoiding bot detection mechanisms.

**In Progress:**

- [ ] **Multi-platform support**: Extend the compatibility of the tool to work seamlessly across different operating systems other than Linux (Windows, macOS). This includes ensuring all dependencies and configurations are platform-independent.

**Planned:**

- [ ] **Helper functions for passing CAPTCHAs and other security measures**: Develop utility functions to automate the solving of CAPTCHAs and bypass other common security measures encountered during web scraping (e.g., automatically passing Cloudflare's "I'm human" challenge).

- [ ] **Support for Selenium Wire**: Integrate Selenium Wire to allow for more advanced network interactions, such as modifying requests and responses.

## Contributing

Contributions are welcome! Please feel free to open an issue or submit a pull request if you encounter any problems or have any suggestions, improvements, or new features you would like to see implemented.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- Special thanks to the contributors of the Selenium project.
- Inspiration from the [undetected-chromedriver project](https://github.com/ultrafunkamsterdam/undetected-chromedriver).

            

        