hrequests


- Name: hrequests
- Version: 0.8.2
- Home page: https://github.com/daijro/hrequests
- Summary: Hrequests (human requests) is a simple, configurable, feature-rich replacement for the Python requests library.
- Upload time: 2024-03-31 07:09:24
- Author: daijro
- Requires Python: `<4.0,>=3.8`
- License: Apache-2.0
- Keywords: tls, client, http, scraping, requests, humans, playwright
            <img src="https://i.imgur.com/r8GcQW1.png" align="center">
</img>

<h2 align="center">hrequests</h2>

<h4 align="center">
<p align="center">
    <a href="https://github.com/daijro/hrequests/blob/main/LICENSE">
        <img src="https://img.shields.io/github/license/daijro/hrequests.svg">
    </a>
    <a href="https://python.org/">
        <img src="https://img.shields.io/badge/python-3.8&#8208;3.12-blue">
    </a>
    <a href="https://pypi.org/project/hrequests/">
        <img alt="PyPI" src="https://img.shields.io/pypi/v/hrequests.svg">
    </a>
    <a href="https://pepy.tech/project/hrequests">
        <img alt="PyPI" src="https://static.pepy.tech/badge/hrequests">
    </a>
    <a href="https://github.com/ambv/black">
        <img src="https://img.shields.io/badge/code%20style-black-black.svg">
    </a>
    <a href="https://github.com/PyCQA/isort">
        <img src="https://img.shields.io/badge/imports-isort-yellow.svg">
    </a>
</p>
    Hrequests (human requests) is a simple, configurable, feature-rich replacement for the Python requests library.
</h4>

### ✨ Features

- Seamless transition between HTTP and headless browsing 💻
- Integrated fast HTML parser 🚀
- High performance network concurrency with goroutines & gevent 🚀
- Replication of browser TLS fingerprints 🚀
- JavaScript rendering 🚀
- Supports HTTP/2 🚀
- Realistic browser header generation 🚀
- JSON serializing up to 10x faster than the standard library 🚀

### 💻 Browser crawling

- Simple & uncomplicated browser automation
- Human-like cursor movement and typing
- Chrome and Firefox extension support
- Full page screenshots
- Proxy support
- Headless and headful support
- No CORS restrictions

### ⚡ More

- High performance ✨
- Minimal dependence on the Python standard libraries
- HTTP backend written in Go
- Automatic gzip & brotli decode
- Written with type safety
- 100% threadsafe ❤️

---

# Installation

Install via pip:

```bash
pip install -U hrequests[all]
python -m hrequests install
```

<details>
<summary>Or, install without headless browsing support</summary>

**Omit the `[all]` extra if you don't want headless browsing support:**

```bash
pip install -U hrequests
```

</details>

---

# Documentation

**For the latest stable hrequests documentation, check the [Gitbook page](https://daijro.gitbook.io/hrequests/).**

1. [Simple Usage](https://github.com/daijro/hrequests#simple-usage)
2. [Sessions](https://github.com/daijro/hrequests#sessions)
3. [Concurrent & Lazy Requests](https://github.com/daijro/hrequests#concurrent--lazy-requests)
4. [HTML Parsing](https://github.com/daijro/hrequests#html-parsing)
5. [Browser Automation](https://github.com/daijro/hrequests#browser-automation)

<hr width=50>

## Simple Usage

Here is an example of a simple `get` request:

```py
>>> import hrequests
>>> resp = hrequests.get('https://www.google.com/')
```

Requests are sent through [bogdanfinn's tls-client](https://github.com/bogdanfinn/tls-client) to spoof the TLS client fingerprint. This is done automatically, and is completely transparent to the user.

Other request methods include `post`, `put`, `delete`, `head`, `options`, and `patch`.

The `Response` object is a near 1:1 replica of the `requests.Response` object, with some additional attributes.

<details>
<summary>Parameters</summary>

```
Parameters:
    url (Union[str, Iterable[str]]): URL or list of URLs to request.
    data (Union[str, bytes, bytearray, dict], optional): Data to send to request. Defaults to None.
    files (Dict[str, Union[BufferedReader, tuple]], optional): Data to send to request. Defaults to None.
    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.
    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.
    cookies (Union[RequestsCookieJar, dict, list], optional): Dict or CookieJar to send. Defaults to None.
    json (dict, optional): Json to send in the request body. Defaults to None.
    allow_redirects (bool, optional): Allow request to redirect. Defaults to True.
    history (bool, optional): Remember request history. Defaults to False.
    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
    timeout (float, optional): Timeout in seconds. Defaults to 30.
    proxy (str, optional): Proxy URL. Defaults to None.
    nohup (bool, optional): Run the request in the background. Defaults to False.
    <Additionally includes all parameters from `hrequests.Session` if a session was not specified>

Returns:
    hrequests.response.Response: Response object
```

</details>

### Properties

Get the response url:

```py
>>> resp.url: str
'https://www.google.com/'
```

Check if the request was successful:

```py
>>> resp.status_code: int
200
>>> resp.reason: str
'OK'
>>> resp.ok: bool
True
>>> bool(resp)
True
```

Getting the response body:

```py
>>> resp.text: str
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta charset="UTF-8"><meta content="origin" name="referrer"><m...'
>>> resp.content: bytes
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta charset="UTF-8"><meta content="origin" name="referrer"><m...'
>>> resp.encoding: str
'UTF-8'
```

Parse the response body as JSON:

```py
>>> resp.json(): Union[dict, list]
{'somedata': True}
```
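
The status and JSON properties above can be combined into a small helper. This is a minimal sketch rather than part of hrequests: `fetch_json` is a hypothetical name, and its `get` parameter exists only so that a stand-in for `hrequests.get` can be passed in (for example, in tests). It relies solely on the `ok`, `status_code`, `reason`, and `json()` members shown in this section.

```python
def fetch_json(url, get=None):
    """Return the decoded JSON body of `url`, or raise if the request failed."""
    if get is None:
        # Imported lazily so the helper can be exercised without hrequests installed.
        import hrequests
        get = hrequests.get

    resp = get(url)
    if not resp.ok:
        # Surface the status line instead of trying to decode an error page.
        raise RuntimeError(f'{url} returned {resp.status_code} {resp.reason}')
    return resp.json()
```

Checking `resp.ok` before calling `resp.json()` keeps an HTML error page from surfacing as a confusing decode error.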

Get the elapsed time of the request:

```py
>>> resp.elapsed: datetime.timedelta
datetime.timedelta(microseconds=77768)
```

Get the response cookies:

```py
>>> resp.cookies: RequestsCookieJar
<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.google.com', domain_specified=True...
```

Get the response headers:

```py
>>> resp.headers: CaseInsensitiveDict
{'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Cache-Control': 'private, max-age=0', 'Content-Encoding': 'br', 'Content-Length': '51288', 'Content-Security-Policy-Report-Only': "object-src 'none';base-uri 'se
```

<hr width=50>

## Sessions

Creating a new Chrome Session object:

```py
>>> session = hrequests.Session()  # version randomized by default
>>> session = hrequests.Session('chrome', version=120)
```

<details>
<summary>Parameters</summary>

```
Parameters:
    browser (Literal['firefox', 'chrome'], optional): Browser to use. Default is 'chrome'.
    version (int, optional): Version of the browser to use. Browser must be specified. Default is randomized.
    os (Literal['win', 'mac', 'lin'], optional): OS to use in header. Default is randomized.
    headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.
    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
    timeout (float, optional): Default timeout in seconds. Defaults to 30.
    proxy (str, optional): Proxy URL. Defaults to None.
    cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.
    certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.
    disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.
    detect_encoding (bool, optional): Detect encoding. Defaults to True.
    ja3_string (str, optional): JA3 string. Defaults to None.
    h2_settings (dict, optional): HTTP/2 settings. Defaults to None.
    additional_decode (str, optional): Decode response body with "gzip" or "br". Defaults to None.
    pseudo_header_order (list, optional): Pseudo header order. Defaults to None.
    priority_frames (list, optional): Priority frames. Defaults to None.
    header_order (list, optional): Header order. Defaults to None.
    force_http1 (bool, optional): Force HTTP/1. Defaults to False.
    catch_panics (bool, optional): Catch panics. Defaults to False.
    debug (bool, optional): Debug mode. Defaults to False.
```

</details>

Browsers can also be created through the `firefox` and `chrome` shortcuts:

```py
>>> session = hrequests.firefox.Session()
>>> session = hrequests.chrome.Session()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    version (int, optional): Version of the browser to use. Browser must be specified. Default is randomized.
    os (Literal['win', 'mac', 'lin'], optional): OS to use in header. Default is randomized.
    headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.
    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
    timeout (float, optional): Default timeout in seconds. Defaults to 30.
    proxy (str, optional): Proxy URL. Defaults to None.
    cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.
    certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.
    disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.
    detect_encoding (bool, optional): Detect encoding. Defaults to True.
    ja3_string (str, optional): JA3 string. Defaults to None.
    h2_settings (dict, optional): HTTP/2 settings. Defaults to None.
    additional_decode (str, optional): Decode response body with "gzip" or "br". Defaults to None.
    pseudo_header_order (list, optional): Pseudo header order. Defaults to None.
    priority_frames (list, optional): Priority frames. Defaults to None.
    header_order (list, optional): Header order. Defaults to None.
    force_http1 (bool, optional): Force HTTP/1. Defaults to False.
    catch_panics (bool, optional): Catch panics. Defaults to False.
    debug (bool, optional): Debug mode. Defaults to False.
```

</details>

`os` can be `'win'`, `'mac'`, or `'lin'`. Default is randomized.

```py
>>> session = hrequests.chrome.Session(os='mac')
```

This will automatically generate headers based on the browser name and OS:

```py
>>> session.headers
{'Accept': '*/*', 'Connection': 'keep-alive', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4; rv:60.2.2) Gecko/20100101 Firefox/60.2.2', 'Accept-Encoding': 'gzip, deflate, br', 'Pragma': 'no-cache'}
```

<details>
<summary>Why is the browser version in the header different than the TLS browser version?</summary>

Website bot detection systems typically do not correlate the TLS fingerprint browser version with the browser header.

By adding more randomization to our headers, we can make our requests appear to come from a larger number of clients. This makes it harder for websites to identify and block our requests based on a consistent browser version.

</details>

### Properties

Here is a simple get request. `session.get` is a wrapper around `hrequests.get`; the only difference is that the session cookies are updated with each request. Creating a session is recommended when making multiple requests to the same domain.

```py
>>> resp = session.get('https://www.google.com/')
```

Session cookies update with each request:

```py
>>> session.cookies: RequestsCookieJar
<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.google.com', domain_specified=True...
```

Regenerate headers for a different OS:

```py
>>> session.os = 'win'
>>> session.headers: CaseInsensitiveDict
{'Accept': '*/*', 'Connection': 'keep-alive', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0.3) Gecko/20100101 Firefox/66.0.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept-Language': 'en-US;q=0.5,en;q=0.3', 'Cache-Control': 'max-age=0', 'DNT': '1', 'Upgrade-Insecure-Requests': '1', 'Pragma': 'no-cache'}
```

### Closing Sessions

Sessions can also be closed to free memory:

```py
>>> session.close()
```

Alternatively, sessions can be used as context managers:

```py
with hrequests.Session() as session:
    resp = session.get('https://www.google.com/')
    print(resp)
```
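
As a slightly larger illustration of the context-manager form, the sketch below fetches several paths on one domain through a single session. `fetch_paths` and its `session_factory` parameter are hypothetical names introduced here; the factory defaults to `hrequests.Session` and is injectable only so the helper can be tried with a stand-in.

```python
def fetch_paths(base_url, paths, session_factory=None):
    """Fetch several paths on one domain through a single session."""
    if session_factory is None:
        # Imported lazily so the helper can be exercised without hrequests installed.
        import hrequests
        session_factory = hrequests.Session

    statuses = {}
    # The context manager closes the session even if a request raises.
    with session_factory() as session:
        for path in paths:
            url = base_url.rstrip('/') + '/' + path.lstrip('/')
            statuses[path] = session.get(url).status_code
    return statuses
```

Because every request goes through the same session, cookies set by an earlier response are sent with the later ones.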

<hr width=50>

## Concurrent & Lazy Requests

### Nohup Requests

Similar to Unix's nohup command, `nohup` requests are sent in the background.

Adding the `nohup=True` keyword argument will return a `LazyTLSRequest` object. The request is sent immediately, but the call does not wait for the response until one of the response's attributes is accessed.

```py
resp1 = hrequests.get('https://www.google.com/', nohup=True)
resp2 = hrequests.get('https://www.google.com/', nohup=True)
```

`resp1` and `resp2` are sent concurrently. They will _never_ pause the current thread, unless an attribute of the response is accessed:

```py
print('Resp 1:', resp1.reason)  # will wait for resp1 to finish, if it hasn't already
print('Resp 2:', resp2.reason)  # will wait for resp2 to finish, if it hasn't already
```

This is useful for sending requests in the background that aren't needed until later.
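
To make the blocking behaviour concrete, here is a minimal standard-library sketch of the same idea: the work starts in a background thread right away, and the caller only waits when it reads an attribute of the result. `LazyResult` is a hypothetical class written for this illustration; it is not how hrequests implements `nohup`.

```python
import threading
from types import SimpleNamespace


class LazyResult:
    """Start `func` in a background thread; block only when the result is read."""

    def __init__(self, func, *args, **kwargs):
        self._result = None
        self._error = None
        self._thread = threading.Thread(
            target=self._run, args=(func, args, kwargs), daemon=True
        )
        self._thread.start()  # the work begins immediately

    def _run(self, func, args, kwargs):
        try:
            self._result = func(*args, **kwargs)
        except Exception as exc:  # re-raised later in the caller's thread
            self._error = exc

    def __getattr__(self, name):
        # Only reached for names not set in __init__, i.e. attributes of the result.
        self._thread.join()  # wait here, and only here
        if self._error is not None:
            raise self._error
        return getattr(self._result, name)


# A stand-in for a slow request that returns an object with a `reason` attribute.
lazy = LazyResult(lambda: SimpleNamespace(reason='OK'))
print(lazy.reason)  # joins the worker thread, then prints: OK
```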

Note: In `nohup`, a new thread is created for each request. For larger scale concurrency, please consider the following:

### Easy Concurrency

You can pass an array/iterator of links to the request methods to send them concurrently. This wraps around [`hrequests.map`](https://github.com/daijro/hrequests#map):

```py
>>> hrequests.get(['https://google.com/', 'https://github.com/'])
(<Response [200]>, <Response [200]>)
```

This also works with `nohup`:

```py
>>> resps = hrequests.get(['https://google.com/', 'https://github.com/'], nohup=True)
>>> resps
(<LazyResponse[Pending]>, <LazyResponse[Pending]>)
>>> # Sometime later...
>>> resps
(<Response [200]>, <Response [200]>)
```

### Grequests-style Concurrency

The methods `async_get`, `async_post`, etc. will create an unsent request. This leverages gevent, making it _blazing fast_.

<details>
<summary>Parameters</summary>

```
Parameters:
    url (str): URL to send request to
    data (Union[str, bytes, bytearray, dict], optional): Data to send to request. Defaults to None.
    files (Dict[str, Union[BufferedReader, tuple]], optional): Data to send to request. Defaults to None.
    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.
    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.
    cookies (Union[RequestsCookieJar, dict, list], optional): Dict or CookieJar to send. Defaults to None.
    json (dict, optional): Json to send in the request body. Defaults to None.
    allow_redirects (bool, optional): Allow request to redirect. Defaults to True.
    history (bool, optional): Remember request history. Defaults to False.
    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
    timeout (float, optional): Timeout in seconds. Defaults to 30.
    proxy (str, optional): Proxy URL. Defaults to None.
    <Additionally includes all parameters from `hrequests.Session` if a session was not specified>

Returns:
    hrequests.response.Response: Response object
```

</details>

Async requests are evaluated on `hrequests.map`, `hrequests.imap`, or `hrequests.imap_enum`.

This functionality is similar to [grequests](https://github.com/spyoungtech/grequests). Unlike grequests, [monkey patching](https://www.gevent.org/api/gevent.monkey.html) is not required because this does not rely on the standard Python SSL library.

Create a set of unsent Requests:

```py
>>> reqs = [
...     hrequests.async_get('https://www.google.com/', browser='firefox'),
...     hrequests.async_get('https://www.duckduckgo.com/'),
...     hrequests.async_get('https://www.yahoo.com/')
... ]
```

#### map

Send them all at the same time using map:

```py
>>> hrequests.map(reqs, size=3)
[<Response [200]>, <Response [200]>, <Response [200]>]
```

<details>
<summary>Parameters</summary>

```
Concurrently converts a list of Requests to Responses.
Parameters:
    requests - a collection of Request objects.
    size - Specifies the number of requests to make at a time. If None, no throttling occurs.
    exception_handler - Callback function, called when exception occurred. Params: Request, Exception
    timeout - Gevent joinall timeout in seconds. (Note: unrelated to requests timeout)

Returns:
    A list of Response objects.
```

</details>

#### imap

`imap` returns a generator that yields responses as they come in:

```py
>>> for resp in hrequests.imap(reqs, size=3):
...    print(resp)
<Response [200]>
<Response [200]>
<Response [200]>
```

<details>
<summary>Parameters</summary>

```
Concurrently converts a generator object of Requests to a generator of Responses.

Parameters:
    requests - a generator or sequence of Request objects.
    size - Specifies the number of requests to make at a time. default is 2
    exception_handler - Callback function, called when exception occurred. Params: Request, Exception

Yields:
    Response objects.
```

</details>

`imap_enum` returns a generator that yields a tuple of `(index, response)` as they come in. The `index` is the index of the request in the original list:

```py
>>> for index, resp in hrequests.imap_enum(reqs, size=3):
...     print(index, resp)
1 <Response [200]>
0 <Response [200]>
2 <Response [200]>
```

<details>
<summary>Parameters</summary>

```
Like imap, but yields tuple of original request index and response object
Unlike imap, failed results and responses from exception handlers that return None are not ignored. Instead, a
tuple of (index, None) is yielded.
Responses are still in arbitrary order.

Parameters:
    requests - a sequence of Request objects.
    size - Specifies the number of requests to make at a time. default is 2
    exception_handler - Callback function, called when exception occurred. Params: Request, Exception

Yields:
    (index, Response) tuples.
```

</details>
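
Since `imap_enum` yields results in arrival order, the index can be used to put them back in request order. The helper below is a small sketch under that assumption; `collect_in_order` is a hypothetical name, and it works on any iterable of `(index, response)` tuples.

```python
def collect_in_order(indexed_responses, total):
    """Rebuild request order from (index, response) pairs that arrive in any order."""
    ordered = [None] * total
    for index, resp in indexed_responses:
        ordered[index] = resp  # failed requests arrive as (index, None)
    return ordered


# Intended use with the requests defined earlier (needs hrequests and network access):
# responses = collect_in_order(hrequests.imap_enum(reqs, size=3), len(reqs))
```

Slots that stay `None` correspond to requests that failed, which makes them easy to retry by index.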

#### Exception Handling

To handle timeouts or any other exception during the connection of the request, you can add an optional exception handler that will be called with the request and exception inside the main thread.

```py
>>> def exception_handler(request, exception):
...    return f'Response failed: {exception}'

>>> bad_reqs = [
...     hrequests.async_get('http://httpbin.org/delay/5', timeout=1),
...     hrequests.async_get('http://fakedomain/'),
...     hrequests.async_get('http://example.com/'),
... ]
>>> hrequests.map(bad_reqs, size=3, exception_handler=exception_handler)
['Response failed: Connection error', 'Response failed: Connection error', <Response [200]>]
```

The value returned by the exception handler will be used in place of the response in the result list.

If an exception handler isn't specified, the default yield type is `hrequests.FailedResponse`.
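
A handler can also record failures for later inspection instead of returning a message. The following is a sketch only: `make_exception_handler` is a hypothetical helper, and it assumes nothing beyond the `(request, exception)` call signature described above.

```python
def make_exception_handler(failures):
    """Build a handler that records each failure in `failures` and returns None."""

    def handler(request, exception):
        failures.append((request, str(exception)))
        return None  # this value takes the response's place in the result list

    return handler


# Intended use (needs hrequests and network access):
# failures = []
# results = hrequests.map(bad_reqs, size=3,
#                         exception_handler=make_exception_handler(failures))
# successes = [resp for resp in results if resp is not None]
```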

<hr width=50>

## HTML Parsing

HTML scraping is based on [selectolax](https://github.com/rushter/selectolax), which is **over 25x faster** than bs4. This functionality is inspired by [requests-html](https://github.com/psf/requests-html).

| Library        | Time (1e5 trials) |
| -------------- | ----------------- |
| BeautifulSoup4 | 52.6              |
| PyQuery        | 7.5               |
| selectolax     | **1.9**           |

The HTML parser can be accessed through the `html` attribute of the response object:

```py
>>> resp = session.get('https://python.org/')
>>> resp.html
<HTML url='https://www.python.org/'>
```

### Parsing page

Grab a set of all links on the page, as-is (anchors excluded):

```py
>>> resp.html.links
{'//docs.python.org/3/tutorial/', '/about/apps/', 'https://github.com/python/pythondotorg/issues', '/accounts/login/', '/dev/peps/', '/about/legal/',...
```

Grab a set of all links on the page, in absolute form (anchors excluded):

```py
>>> resp.html.absolute_links
{'https://github.com/python/pythondotorg/issues', 'https://docs.python.org/3/tutorial/', 'https://www.python.org/about/success/', 'http://feedproxy.g...
```
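
The difference between the two forms can be reproduced with the standard library. The sketch below resolves raw links against the page URL using `urllib.parse.urljoin`; it illustrates what "absolute form" means and makes no claim about how hrequests computes `absolute_links`. `absolutize` is a hypothetical name.

```python
from urllib.parse import urljoin


def absolutize(base_url, links):
    """Resolve relative and scheme-relative links against `base_url`."""
    return {urljoin(base_url, link) for link in links}


raw_links = {
    '/about/apps/',                                   # path-relative
    '//docs.python.org/3/tutorial/',                  # scheme-relative
    'https://github.com/python/pythondotorg/issues',  # already absolute
}
for link in sorted(absolutize('https://www.python.org/', raw_links)):
    print(link)
```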

Search for text on the page:

```py
>>> resp.html.search('Python is a {} language')[0]
programming
```
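
The `{}` in the template acts as a placeholder for the text to capture. As a rough illustration of that matching behaviour, the sketch below turns such a template into a regular expression. `template_search` is a hypothetical function and only mimics the example above; hrequests may implement `search` differently.

```python
import re


def template_search(template, text):
    """Return the text captured by each `{}` in `template`, or None if no match."""
    literal_parts = template.split('{}')
    # Escape the literal text and put a non-greedy capture group at every `{}`.
    pattern = '(.+?)'.join(re.escape(part) for part in literal_parts)
    match = re.search(pattern, text)
    return match.groups() if match else None


sentence = 'Python is a programming language that lets you work quickly.'
print(template_search('Python is a {} language', sentence)[0])  # programming
```

Because the capture group is non-greedy, a template that ends in `{}` would capture only a single character in this sketch.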

### Selecting elements

Select an element using a CSS Selector:

```py
>>> about = resp.html.find('#about')
```

<details>
<summary>Parameters</summary>

```
Given a CSS Selector, returns a list of
:class:`Element <Element>` objects or a single one.

Parameters:
    selector: CSS Selector to use.
    clean: Whether or not to sanitize the found HTML of ``<script>`` and ``<style>``
    containing: If specified, only return elements that contain the provided text.
    first: Whether or not to return just the first result.
    raise_exception: Raise an exception if no elements are found. Default is True.
    _encoding: The encoding format.

Returns:
    A list of :class:`Element <Element>` objects or a single one.

Example CSS Selectors:
- ``a``
- ``a.someClass``
- ``a#someID``
- ``a[target=_blank]``
See W3School's `CSS Selectors Reference
<https://www.w3schools.com/cssref/css_selectors.asp>`_
for more details.
If ``first`` is ``True``, only returns the first
:class:`Element <Element>` found.
```

</details>

### Introspecting elements

Grab an Element's text contents:

```py
>>> print(about.text)
About
Applications
Quotes
Getting Started
Help
Python Brochure
```

Getting an Element's attributes:

```py
>>> about.attrs
{'id': 'about', 'class': ('tier-1', 'element-1'), 'aria-haspopup': 'true'}
>>> about.id
'about'
```

Get an Element's raw HTML:

```py
>>> about.html
'<li aria-haspopup="true" class="tier-1 element-1 " id="about">\n<a class="" href="/about/" title="">About</a>\n<ul aria-hidden="true" class="subnav menu" role="menu">\n<li class="tier-2 element-1" role="treeitem"><a href="/about/apps/" title="">Applications</a></li>\n<li class="tier-2 element-2" role="treeitem"><a href="/about/quotes/" title="">Quotes</a></li>\n<li class="tier-2 element-3" role="treeitem"><a href="/about/gettingstarted/" title="">Getting Started</a></li>\n<li class="tier-2 element-4" role="treeitem"><a href="/about/help/" title="">Help</a></li>\n<li class="tier-2 element-5" role="treeitem"><a href="http://brochure.getpython.info/" title="">Python Brochure</a></li>\n</ul>\n</li>'
```

Select Elements within Elements:

```py
>>> about.find_all('a')
[<Element 'a' href='/about/' title='' class=''>, <Element 'a' href='/about/apps/' title=''>, <Element 'a' href='/about/quotes/' title=''>, <Element 'a' href='/about/gettingstarted/' title=''>, <Element 'a' href='/about/help/' title=''>, <Element 'a' href='http://brochure.getpython.info/' title=''>]
>>> about.find('a')
<Element 'a' href='/about/' title='' class=''>
```

Searching by HTML attributes:

```py
>>> about.find('li', role='treeitem')
<Element 'li' role='treeitem' class=('tier-2', 'element-1')>
```

Search for links within an element:

```py
>>> about.absolute_links
{'http://brochure.getpython.info/', 'https://www.python.org/about/gettingstarted/', 'https://www.python.org/about/', 'https://www.python.org/about/quotes/', 'https://www.python.org/about/help/', 'https://www.python.org/about/apps/'}
```

<hr width=50>

## Browser Automation

Hrequests supports both Firefox and Chrome browsers, headless and headful sessions, and browser addons/extensions:

#### Browser support table

Chrome supports both Manifest v2 and v3 extensions. Firefox only supports Manifest v2 extensions.

Only Firefox supports Cloudflare WAFs.

| Browser | MV2                | MV3                | Cloudflare WAFs    |
| ------- | ------------------ | ------------------ | ------------------ |
| Firefox | :heavy_check_mark: | :x:                | :heavy_check_mark: |
| Chrome  | :heavy_check_mark: | :heavy_check_mark: | :x:                |

### Usage

You can spawn a `BrowserSession` instance by calling it:

```py
>>> page = hrequests.BrowserSession()  # headless=True by default
```

<details>
<summary>Parameters</summary>

```
Parameters:
    headless (bool, optional): Whether to run the browser in headless mode. Defaults to True.
    session (hrequests.session.TLSSession, optional): Session to use for headers, cookies, etc.
    resp (hrequests.response.Response, optional): Response to update with cookies, headers, etc.
    proxy (str, optional): Proxy to use for the browser. Example: http://1.2.3.4:8080
    mock_human (bool, optional): Whether to emulate human behavior. Defaults to False.
    browser (Literal['firefox', 'chrome'], optional): Generate useragent headers for a specific browser
    os (Literal['win', 'mac', 'lin'], optional): Generate headers for a specific OS
    extensions (Union[str, Iterable[str]], optional): Path to a folder of unpacked extensions, or a list of paths to unpacked extensions
```

</details>

By default, `BrowserSession` returns a Chrome browser.

To create a Firefox session, use the `firefox` shortcut instead:

```py
>>> page = hrequests.firefox.BrowserSession()
```

`BrowserSession` is entirely safe to use across threads.

### Render an existing Response

Responses have a `.render()` method. This will render the contents of the response in a browser page.

Once the page is closed, the Response content and the Response's session cookies will be updated.

#### Simple usage

Rendered browser sessions will use the browser set in the initial request.

You can set a request's browser with the `browser` parameter in the `hrequests.get` method:

```py
>>> resp = hrequests.get('https://example.com', browser='chrome')
```

Or by setting the `browser` parameter of the `hrequests.Session` object:

```py
>>> session = hrequests.Session(browser='chrome')
>>> resp = session.get('https://example.com')
```

**Example - submitting a login form:**

```py
>>> session = hrequests.Session(browser='chrome')
>>> resp = session.get('https://www.somewebsite.com/')
>>> with resp.render(mock_human=True) as page:
...     page.type('.input#username', 'myuser')
...     page.type('.input#password', 'p4ssw0rd')
...     page.click('#submit')
# `session` & `resp` now have updated cookies, content, etc.
```

<details>
<summary><strong>Or, without a context manager</strong></summary>

```py
>>> session = hrequests.Session(browser='chrome')
>>> resp = session.get('https://www.somewebsite.com/')
>>> page = resp.render(mock_human=True)
>>> page.type('.input#username', 'myuser')
>>> page.type('.input#password', 'p4ssw0rd')
>>> page.click('#submit')
>>> page.close()  # must close the page when done!
```

</details>

The `mock_human` parameter will emulate human-like behavior. This includes easing and randomizing mouse movements, and randomizing typing speed. This functionality is based on [botright](https://github.com/Vinyzu/botright/).

<details>
<summary>Parameters</summary>

```
Parameters:
    headless (bool, optional): Whether to run the browser in headless mode. Defaults to False.
    mock_human (bool, optional): Whether to emulate human behavior. Defaults to False.
    extensions (Union[str, Iterable[str]], optional): Path to a folder of unpacked extensions, or a list of paths to unpacked extensions
```

</details>
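
When a rendered page is used without a context manager, it is easy to skip `page.close()` if a call fails partway through. The sketch below wraps the login steps in `try`/`finally` so the page is always closed. `submit_login` is a hypothetical helper, and the selectors are the placeholder ones from the login example, not those of any real site.

```python
def submit_login(resp, username, password):
    """Fill in and submit a login form on a rendered response, always closing the page."""
    page = resp.render(mock_human=True)
    try:
        page.type('.input#username', username)
        page.type('.input#password', password)
        page.click('#submit')
    finally:
        page.close()  # runs even if a selector is missing or a call times out


# Intended use (needs hrequests with browser support and network access):
# session = hrequests.Session(browser='chrome')
# submit_login(session.get('https://www.somewebsite.com/'), 'myuser', 'p4ssw0rd')
```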

### Properties

Cookies are inherited from the session:

```py
>>> page.cookies: RequestsCookieJar  # cookies are inherited from the session
<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.somewebsite.com', domain_specified=True...
```

### Pulling page data

Get current page url:

```py
>>> page.url: str
'https://www.somewebsite.com/'
```

Get page content:

```py
>>> page.text: str
'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world\'s information, including webpag'
>>> page.content: bytes
b'<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en"><head><meta content="Search the world\'s information, including webpag'
```

Get the status of the last navigation:

```py
>>> page.status_code: int
200
>>> page.reason: str
'OK'
```

Parsing HTML from the page content:

```py
>>> page.html.find_all('a')
[<Element 'a' href='/about/' title='' class=''>, <Element 'a' href='/about/apps/' title=''>, ...]
>>> page.html.find('a')
<Element 'a' href='/about/' title='' class=''>
```

Take a screenshot of the page:

```py
>>> page.screenshot(path='screenshot.png')
```

<details>
<summary>Parameters</summary>

```
Take a screenshot of the page

Parameters:
    selector (str, optional): CSS selector to screenshot
    path (str, optional): Path to save screenshot to. Defaults to None.
    full_page (bool): Whether to take a screenshot of the full scrollable page. Cannot be used with selector. Defaults to False.

Returns:
    Optional[bytes]: Returns the screenshot buffer, if `path` was not provided
```

</details>
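
According to the parameter list above, omitting `path` makes `screenshot` return the image buffer instead of writing a file. The sketch below uses that to save a full-page capture manually; `save_full_page` is a hypothetical helper, and the same pattern would apply if the bytes were uploaded or hashed rather than written to disk.

```python
from pathlib import Path


def save_full_page(page, path):
    """Capture the full scrollable page and write the image bytes to `path`."""
    data = page.screenshot(full_page=True)  # no `path` given, so bytes are returned
    Path(path).write_bytes(data)
    return len(data)  # number of bytes written
```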

### Navigate the browser

Navigate to a url:

```py
>>> page.url = 'https://bing.com'
# or use goto
>>> page.goto('https://bing.com')
```

Navigate through page history:

```py
>>> page.back()
>>> page.forward()
```

### Controlling elements

Click an element:

```py
>>> page.click('#my-button')
# or through the html parser
>>> page.html.find('#my-button').click()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): CSS selector to click.
    button (Literal['left', 'right', 'middle'], optional): Mouse button to click. Defaults to 'left'.
    count (int, optional): Number of clicks. Defaults to 1.
    timeout (float, optional): Timeout in seconds. Defaults to 30.
    wait_after (bool, optional): Wait for a page event before continuing. Defaults to True.
```

</details>

Hover over an element:

```py
>>> page.hover('.dropbtn')
# or through the html parser
>>> page.html.find('.dropbtn').hover()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): CSS selector to hover over
    modifiers (List[Literal['Alt', 'Control', 'Meta', 'Shift']], optional): Modifier keys to press. Defaults to None.
    timeout (float, optional): Timeout in seconds. Defaults to 90.
```

</details>

Type text into an element:

```py
>>> page.type('#my-input', 'Hello world!')
# or through the html parser
>>> page.html.find('#my-input').type('Hello world!')
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): CSS selector to type in
    text (str): Text to type
    delay (int, optional): Delay between keypresses in ms. On mock_human, this is randomized by 50%. Defaults to 50.
    timeout (float, optional): Timeout in seconds. Defaults to 30.
```

</details>

Drag and drop an element:

```py
>>> page.dragTo('#source-selector', '#target-selector')
# or through the html parser
>>> page.html.find('#source-selector').dragTo('#target-selector')
```

<details>
<summary>Parameters</summary>

```
Parameters:
    source (str): Source to drag from
    target (str): Target to drop to
    timeout (float, optional): Timeout in seconds. Defaults to 30.
    wait_after (bool, optional): Wait for a page event before continuing. Defaults to False.
    check (bool, optional): Check if an element is draggable before running. Defaults to False.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>

### Check page elements

Check if a selector is visible and enabled:

```py
>>> page.isVisible('#my-selector'): bool
>>> page.isEnabled('#my-selector'): bool
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): Selector to check
```

</details>

Evaluate and return a script:

```py
>>> page.evaluate('selector => document.querySelector(selector).checked', '#my-selector')
```

<details>
<summary>Parameters</summary>

```
Parameters:
    script (str): Javascript to evaluate in the page
    arg (str, optional): Argument to pass into the javascript function
```

</details>
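
Since `arg` is documented as a single string, one way to read an arbitrary property is to build the script text in Python and pass only the selector as the argument. This is a sketch under that assumption; `read_property` is a hypothetical helper, and `json.dumps` is used so the property name is embedded as a properly quoted JavaScript string literal.

```python
import json


def read_property(page, selector: str, prop: str):
    """Return `document.querySelector(selector)[prop]` from the page.

    Hypothetical helper: it relies only on the documented
    `page.evaluate(script, arg)` call. The property name is embedded
    in the script text; the selector is passed as the script argument.
    """
    script = f"selector => document.querySelector(selector)[{json.dumps(prop)}]"
    return page.evaluate(script, selector)
```

For example, `read_property(page, '#my-selector', 'checked')` sends the same script as the one-liner above, written with bracket notation.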

### Awaiting events

Wait for a page navigation to finish:

```py
>>> page.awaitNavigation()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    timeout (float, optional): Timeout in seconds. Defaults to 30.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>

Wait for a script or function to return a truthy value:

```py
>>> page.awaitScript('selector => document.querySelector(selector).value === 100', '#progress')
```

<details>
<summary>Parameters</summary>

```
Parameters:
    script (str): Script to evaluate
    arg (str, optional): Argument to pass to script
    timeout (float, optional): Timeout in seconds. Defaults to 30.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>

Wait for the URL to match:

```py
>>> import re
>>> page.awaitUrl(re.compile(r'https?://www\.google\.com/.*'), timeout=10)
```

<details>
<summary>Parameters</summary>

```
Parameters:
    url (Union[str, Pattern[str], Callable[[str], bool]]): URL to match
    timeout (float, optional): Timeout in seconds. Defaults to 30.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>
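
According to the type signature above, `url` may also be a callable that takes the URL string and returns a `bool`. The sketch below builds such a predicate with the standard library; `on_host` is a hypothetical name, and the assumption that the callable receives the full URL as a string is taken from the `Callable[[str], bool]` annotation.

```python
from typing import Callable
from urllib.parse import urlparse


def on_host(hostname: str) -> Callable[[str], bool]:
    """Build a predicate that is True when a URL's host equals `hostname`.

    Hypothetical helper. Comparing the parsed hostname avoids false
    matches when the host name merely appears in the path or query.
    """
    def matches(url: str) -> bool:
        return urlparse(url).hostname == hostname

    return matches
```

The predicate could then be passed in place of the regular expression, for example `page.awaitUrl(on_host('www.google.com'), timeout=10)`.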

Wait for an element to exist on the page:

```py
>>> page.awaitSelector('#my-selector')
# or through the html parser
>>> page.html.find('#my-selector').awaitSelector()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): Selector to wait for
    timeout (float, optional): Timeout in seconds. Defaults to 30.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>

Wait for an element to be enabled:

```py
>>> page.awaitEnabled('#my-selector')
# or through the html parser
>>> page.html.find('#my-selector').awaitEnabled()
```

<details>
<summary>Parameters</summary>

```
Parameters:
    selector (str): Selector to wait for
    timeout (float, optional): Timeout in seconds. Defaults to 30.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached
```

</details>
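
The two waits above are commonly chained before a click. A minimal sketch, assuming `page` provides `awaitSelector`, `awaitEnabled`, and `click` as used elsewhere in this document; `click_when_ready` is a hypothetical helper name:

```python
def click_when_ready(page, selector: str, timeout: float = 30) -> None:
    """Wait until the element exists and is enabled, then click it.

    Hypothetical helper. Each wait gets the full `timeout` (in seconds),
    so the worst case is roughly twice that value. Per the docs above,
    either wait may raise BrowserTimeoutException.
    """
    page.awaitSelector(selector, timeout=timeout)
    page.awaitEnabled(selector, timeout=timeout)
    page.click(selector)
```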

Screenshot an element:

```py
>>> page.screenshot('#my-selector', path='screenshot.png')
# or through the html parser
>>> page.html.find('#my-selector').screenshot('selector.png')
```

<details>
<summary>Parameters</summary>

```
Screenshot an element

Parameters:
    selector (str, optional): CSS selector to screenshot
    path (str, optional): Path to save screenshot to. Defaults to None.
    full_page (bool): Whether to take a screenshot of the full scrollable page. Cannot be used with selector. Defaults to False.

Returns:
    Optional[bytes]: Returns the screenshot buffer, if `path` was not provided
```

</details>
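
When `path` is omitted, the call returns the image bytes instead of writing a file. The sketch below shows one way to handle that buffer; `save_element_screenshot` is a hypothetical helper, and the only hrequests behavior it assumes is the `Optional[bytes]` return value described above.

```python
from pathlib import Path


def save_element_screenshot(page, selector: str, out_file: str) -> Path:
    """Capture an element and write the returned bytes to `out_file`.

    Hypothetical helper. It calls `page.screenshot(selector)` without
    `path`, so the screenshot buffer is returned rather than saved.
    """
    data = page.screenshot(selector)
    if data is None:
        raise ValueError(f"no screenshot data returned for {selector!r}")
    out = Path(out_file)
    # Create missing parent folders so nested output paths work.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(data)
    return out
```

Keeping the bytes in memory first makes it possible to hash or post-process the image before deciding where to store it.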

### Adding Firefox/Chrome extensions

Firefox/Chrome extensions can be easily imported into a browser session. Some potentially useful extensions include:

- **hektCaptcha** - hCaptcha solver ([Chrome](https://github.com/Wikidepia/hektCaptcha-extension))

- **uBlock Origin** - Ad & popup blocker ([Chrome](https://github.com/gorhill/uBlock), [Firefox](https://github.com/gorhill/uBlock))

- **FastForward** - Bypass & skip link redirects ([Chrome](https://nightly.link/FastForwardTeam/FastForward/workflows/main/main/FastForward_chromium.zip), [Firefox](https://nightly.link/FastForwardTeam/FastForward/workflows/main/main/FastForward_firefox.zip))

Note: Firefox extensions are _Firefox-only_, and Chrome extensions are _Chrome-only_.

If you plan on using Firefox-specific or Chrome-specific extensions, make sure to set your `browser` parameter to the correct browser before rendering the page:

```py
# when dealing with captchas, make sure to use firefox
>>> resp = hrequests.get('https://accounts.hcaptcha.com/demo', browser='firefox')
```

Extensions are added with the `extensions` parameter:

- This can be a list of absolute paths to unpacked extensions:

  ```py
  with resp.render(extensions=['C:\\extensions\\hektcaptcha', 'C:\\extensions\\ublockorigin']) as page:
      ...  # interact with the page here
  ```

- Or a folder containing the unpacked extensions:
  ```py
  with resp.render(extensions='C:\\extensions') as page:
      ...  # interact with the page here
  ```
  Note that these need to be _unpacked_ extensions. You can unpack a `.crx` file by changing the file extension to `.zip` and extracting the contents.
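
To build the list form from a folder, the unpacked extensions can be discovered by looking for a `manifest.json` in each subfolder, which every unpacked Chrome or Firefox extension contains at its root. A minimal sketch; `unpacked_extensions` is a hypothetical helper and is not part of hrequests:

```python
from pathlib import Path
from typing import List


def unpacked_extensions(folder: str) -> List[str]:
    """Return absolute paths of subfolders that look like unpacked extensions.

    Hypothetical helper: a subfolder counts as an extension when it
    contains a `manifest.json` file directly inside it.
    """
    root = Path(folder)
    return sorted(
        str(child.resolve())
        for child in root.iterdir()
        if child.is_dir() and (child / "manifest.json").is_file()
    )
```

The result could be passed as `resp.render(extensions=unpacked_extensions('C:\\extensions'))`, which also makes it easy to filter out extensions before launching the browser.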

Here is an example of using a captcha solver:

```py
>>> resp = hrequests.get('https://accounts.hcaptcha.com/demo', browser='firefox')
>>> with resp.render(extensions=['C:\\extensions\\hektcaptcha']) as page:
...     page.awaitSelector('.hcaptcha-success')  # wait for captcha to finish
...     page.click('input[type=submit]')
```

### Requests & Responses

Requests can also be sent within browser sessions. These operate the same as the standard `hrequests.request`, and will use the browser's cookies and headers. The `BrowserSession` cookies will be updated with each request.

This returns a normal `Response` object:

```py
>>> resp = page.get('https://duckduckgo.com')
```

<details>
<summary>Parameters</summary>

```
Parameters:
    url (str): URL to send request to
    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.
    data (Union[str, dict], optional): Data to send to request. Defaults to None.
    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.
    form (dict, optional): Form data to send with the request. Defaults to None.
    multipart (dict, optional): Multipart data to send with the request. Defaults to None.
    timeout (float, optional): Timeout in seconds. Defaults to 30.
    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.
    max_redirects (int, optional): Maximum number of redirects to follow. Defaults to None.

Throws:
    hrequests.exceptions.BrowserTimeoutException: If timeout is reached

Returns:
    hrequests.response.Response: Response object
```

</details>

Other methods include `post`, `put`, `delete`, `head`, and `patch`.
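
Because these calls return a normal `Response`, the usual attributes such as `status_code` are available. The sketch below checks several URLs through the browser session; `status_codes` is a hypothetical helper that assumes only the documented `page.get(url)` call.

```python
from typing import Dict, Iterable


def status_codes(page, urls: Iterable[str]) -> Dict[str, int]:
    """Request each URL through the browser session and map it to its status.

    Hypothetical helper. Requests are sent one after another, and each
    one reuses the browser's cookies and headers as described above.
    """
    return {url: page.get(url).status_code for url in urls}
```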

### Closing the page

The `BrowserSession` object must be closed when finished. This will close the browser, update the response data, and merge new cookies with the session cookies.

```py
>>> page.close()
```

Note that this is automatically done when using a context manager.
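
Where a context manager is not convenient, a `try`/`finally` block gives the same guarantee that the page is closed even if an interaction raises. A minimal sketch; `run_and_close` is a hypothetical helper and `action` is any callable that takes the page:

```python
def run_and_close(page, action):
    """Run `action(page)` and always close the page afterwards.

    Hypothetical helper. The `finally` clause runs whether `action`
    returns normally or raises, so `page.close()` is never skipped.
    """
    try:
        return action(page)
    finally:
        page.close()
```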

Session cookies are updated:

```py
>>> session.cookies: RequestsCookieJar
<RequestsCookieJar[Cookie(version=0, name='MUID', value='123456789', port=None, port_specified=False, domain='.bing.com', domain_specified=True, domain_initial_dot=True...
```

Response data is updated:

```py
>>> resp.url: str
'https://www.bing.com/?toWww=1&redig=823778234657823652376438'
>>> resp.content: Union[bytes, str]
'<!DOCTYPE html><html lang="en" dir="ltr"><head><meta name="theme-color" content="#4F4F4F"><meta name="description" content="Bing helps you turn inform...
```

#### Other ways to create a Browser Session

You can use `.render` to spawn a `BrowserSession` object directly from a URL:

```py
# Using a Session:
>>> page = session.render('https://google.com')
# Or without a session at all:
>>> page = hrequests.render('https://google.com')
```

Make sure to close all `BrowserSession` objects when done!

```py
>>> page.close()
```

---


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/daijro/hrequests",
    "name": "hrequests",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "tls, client, http, scraping, requests, humans, playwright",
    "author": "daijro",
    "author_email": "daijro.dev@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/33/c1/7caaceee3c3dec2d19efd40beeb77f374171bfbfc2fd7e63b011abfc04e6/hrequests-0.8.2.tar.gz",
    "platform": null,
    "description": "<img src=\"https://i.imgur.com/r8GcQW1.png\" align=\"center\">\n</img>\n\n<h2 align=\"center\">hrequests</h2>\n\n<h4 align=\"center\">\n<p align=\"center\">\n    <a href=\"https://github.com/daijro/hrequests/blob/main/LICENSE\">\n        <img src=\"https://img.shields.io/github/license/daijro/hrequests.svg\">\n    </a>\n    <a href=\"https://python.org/\">\n        <img src=\"https://img.shields.io/badge/python-3.8&#8208;3.12-blue\">\n    </a>\n    <a href=\"https://pypi.org/project/hrequests/\">\n        <img alt=\"PyPI\" src=\"https://img.shields.io/pypi/v/hrequests.svg\">\n    </a>\n    <a href=\"https://pepy.tech/project/hrequests\">\n        <img alt=\"PyPI\" src=\"https://static.pepy.tech/badge/hrequests\">\n    </a>\n    <a href=\"https://github.com/ambv/black\">\n        <img src=\"https://img.shields.io/badge/code%20style-black-black.svg\">\n    </a>\n    <a href=\"https://github.com/PyCQA/isort\">\n        <img src=\"https://img.shields.io/badge/imports-isort-yellow.svg\">\n    </a>\n</p>\n    Hrequests (human requests) is a simple, configurable, feature-rich, replacement for the Python requests library. 
\n</h4>\n\n### \u2728 Features\n\n- Seamless transition between HTTP and headless browsing \ud83d\udcbb\n- Integrated fast HTML parser \ud83d\ude80\n- High performance network concurrency with goroutines & gevent \ud83d\ude80\n- Replication of browser TLS fingerprints \ud83d\ude80\n- JavaScript rendering \ud83d\ude80\n- Supports HTTP/2 \ud83d\ude80\n- Realistic browser header generation \ud83d\ude80\n- JSON serializing up to 10x faster than the standard library \ud83d\ude80\n\n### \ud83d\udcbb Browser crawling\n\n- Simple & uncomplicated browser automation\n- Human-like cursor movement and typing\n- Chrome and Firefox extension support\n- Full page screenshots\n- Proxy support\n- Headless and headful support\n- No CORS restrictions\n\n### \u26a1 More\n\n- High performance \u2728\n- Minimal dependence on the python standard libraries\n- HTTP backend written in Go\n- Automatic gzip & brotli decode\n- Written with type safety\n- 100% threadsafe \u2764\ufe0f\n\n---\n\n# Installation\n\nInstall via pip:\n\n```bash\npip install -U hrequests[all]\npython -m hrequests install\n```\n\n<details>\n<summary>Or, install without headless browsing support</i></summary>\n\n**Ignore the `[all]` option if you don't want headless browsing support:**\n\n```bash\npip install -U hrequests\n```\n\n</details>\n\n---\n\n# Documentation\n\n**For the latest stable hrequests documentation, check the [Gitbook page](https://daijro.gitbook.io/hrequests/).**\n\n1. [Simple Usage](https://github.com/daijro/hrequests#simple-usage)\n2. [Sessions](https://github.com/daijro/hrequests#sessions)\n3. [Concurrent & Lazy Requests](https://github.com/daijro/hrequests#concurrent--lazy-requests)\n4. [HTML Parsing](https://github.com/daijro/hrequests#html-parsing)\n5. 
[Browser Automation](https://github.com/daijro/hrequests#browser-automation)\n\n<hr width=50>\n\n## Simple Usage\n\nHere is an example of a simple `get` request:\n\n```py\n>>> resp = hrequests.get('https://www.google.com/')\n```\n\nRequests are sent through [bogdanfinn's tls-client](https://github.com/bogdanfinn/tls-client) to spoof the TLS client fingerprint. This is done automatically, and is completely transparent to the user.\n\nOther request methods include `post`, `put`, `delete`, `head`, `options`, and `patch`.\n\nThe `Response` object is a near 1:1 replica of the `requests.Response` object, with some additional attributes.\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    url (Union[str, Iterable[str]]): URL or list of URLs to request.\n    data (Union[str, bytes, bytearray, dict], optional): Data to send to request. Defaults to None.\n    files (Dict[str, Union[BufferedReader, tuple]], optional): Data to send to request. Defaults to None.\n    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.\n    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.\n    cookies (Union[RequestsCookieJar, dict, list], optional): Dict or CookieJar to send. Defaults to None.\n    json (dict, optional): Json to send in the request body. Defaults to None.\n    allow_redirects (bool, optional): Allow request to redirect. Defaults to True.\n    history (bool, optional): Remember request history. Defaults to False.\n    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n    proxy (str, optional): Proxy URL. Defaults to None.\n    nohup (bool, optional): Run the request in the background. 
Defaults to False.\n    <Additionally includes all parameters from `hrequests.Session` if a session was not specified>\n\nReturns:\n    hrequests.response.Response: Response object\n```\n\n</details>\n\n### Properties\n\nGet the response url:\n\n```py\n>>> resp.url: str\n'https://www.google.com/'\n```\n\nCheck if the request was successful:\n\n```py\n>>> resp.status_code: int\n200\n>>> resp.reason: str\n'OK'\n>>> resp.ok: bool\nTrue\n>>> bool(resp)\nTrue\n```\n\nGetting the response body:\n\n```py\n>>> resp.text: str\n'<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org/WebPage\" lang=\"en\"><head><meta charset=\"UTF-8\"><meta content=\"origin\" name=\"referrer\"><m...'\n>>> resp.content: bytes\nb'<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org/WebPage\" lang=\"en\"><head><meta charset=\"UTF-8\"><meta content=\"origin\" name=\"referrer\"><m...'\n>>> resp.encoding: str\n'UTF-8'\n```\n\nParse the response body as JSON:\n\n```py\n>>> resp.json(): Union[dict, list]\n{'somedata': True}\n```\n\nGet the elapsed time of the request:\n\n```py\n>>> resp.elapsed: datetime.timedelta\ndatetime.timedelta(microseconds=77768)\n```\n\nGet the response cookies:\n\n```py\n>>> resp.cookies: RequestsCookieJar\n<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.google.com', domain_specified=True...\n```\n\nGet the response headers:\n\n```py\n>>> resp.headers: CaseInsensitiveDict\n{'Alt-Svc': 'h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000', 'Cache-Control': 'private, max-age=0', 'Content-Encoding': 'br', 'Content-Length': '51288', 'Content-Security-Policy-Report-Only': \"object-src 'none';base-uri 'se\n```\n\n<hr width=50>\n\n## Sessions\n\nCreating a new Chrome Session object:\n\n```py\n>>> session = hrequests.Session()  # version randomized by default\n>>> session = hrequests.Session('chrome', version=120)\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    browser 
(Literal['firefox', 'chrome'], optional): Browser to use. Default is 'chrome'.\n    version (int, optional): Version of the browser to use. Browser must be specified. Default is randomized.\n    os (Literal['win', 'mac', 'lin'], optional): OS to use in header. Default is randomized.\n    headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.\n    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.\n    timeout (float, optional): Default timeout in seconds. Defaults to 30.\n    proxy (str, optional): Proxy URL. Defaults to None.\n    cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.\n    certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.\n    disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.\n    detect_encoding (bool, optional): Detect encoding. Defaults to True.\n    ja3_string (str, optional): JA3 string. Defaults to None.\n    h2_settings (dict, optional): HTTP/2 settings. Defaults to None.\n    additional_decode (str, optional): Decode response body with \"gzip\" or \"br\". Defaults to None.\n    pseudo_header_order (list, optional): Pseudo header order. Defaults to None.\n    priority_frames (list, optional): Priority frames. Defaults to None.\n    header_order (list, optional): Header order. Defaults to None.\n    force_http1 (bool, optional): Force HTTP/1. Defaults to False.\n    catch_panics (bool, optional): Catch panics. Defaults to False.\n    debug (bool, optional): Debug mode. Defaults to False.\n```\n\n</details>\n\nBrowsers can also be created through the `firefox` and `chrome` shortcuts:\n\n```py\n>>> session = hrequests.firefox.Session()\n>>> session = hrequests.chrome.Session()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    version (int, optional): Version of the browser to use. 
Browser must be specified. Default is randomized.\n    os (Literal['win', 'mac', 'lin'], optional): OS to use in header. Default is randomized.\n    headers (dict, optional): Dictionary of HTTP headers to send with the request. Default is generated from `browser` and `os`.\n    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.\n    timeout (float, optional): Default timeout in seconds. Defaults to 30.\n    proxy (str, optional): Proxy URL. Defaults to None.\n    cookies (Union[RequestsCookieJar, dict, list], optional): Cookie Jar, or cookie list/dict to send. Defaults to None.\n    certificate_pinning (Dict[str, List[str]], optional): Certificate pinning. Defaults to None.\n    disable_ipv6 (bool, optional): Disable IPv6. Defaults to False.\n    detect_encoding (bool, optional): Detect encoding. Defaults to True.\n    ja3_string (str, optional): JA3 string. Defaults to None.\n    h2_settings (dict, optional): HTTP/2 settings. Defaults to None.\n    additional_decode (str, optional): Decode response body with \"gzip\" or \"br\". Defaults to None.\n    pseudo_header_order (list, optional): Pseudo header order. Defaults to None.\n    priority_frames (list, optional): Priority frames. Defaults to None.\n    header_order (list, optional): Header order. Defaults to None.\n    force_http1 (bool, optional): Force HTTP/1. Defaults to False.\n    catch_panics (bool, optional): Catch panics. Defaults to False.\n    debug (bool, optional): Debug mode. Defaults to False.\n```\n\n</details>\n\n`os` can be `'win'`, `'mac'`, or `'lin'`. 
Default is randomized.\n\n```py\n>>> session = hrequests.chrome.Session(os='mac')\n```\n\nThis will automatically generate headers based on the browser name and OS:\n\n```py\n>>> session.headers\n{'Accept': '*/*', 'Connection': 'keep-alive', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_4; rv:60.2.2) Gecko/20100101 Firefox/60.2.2', 'Accept-Encoding': 'gzip, deflate, br', 'Pragma': 'no-cache'}\n```\n\n<details>\n<summary>Why is the browser version in the header different than the TLS browser version?</summary>\n\nWebsite bot detection systems typically do not correlate the TLS fingerprint browser version with the browser header.\n\nBy adding more randomization to our headers, we can make our requests appear to be coming from a larger number of clients. We can make it seem like our requests are coming from a larger number of clients. This makes it harder for websites to identify and block our requests based on a consistent browser version.\n\n</details>\n\n### Properties\n\nHere is a simple get request. This is a wrapper around `hrequests.get`. The only difference is that the session cookies are updated with each request. 
Creating sessions are recommended for making multiple requests to the same domain.\n\n```py\n>>> resp = session.get('https://www.google.com/')\n```\n\nSession cookies update with each request:\n\n```py\n>>> session.cookies: RequestsCookieJar\n<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.google.com', domain_specified=True...\n```\n\nRegenerate headers for a different OS:\n\n```py\n>>> session.os = 'win'\n>>> session.headers: CaseInsensitiveDict\n{'Accept': '*/*', 'Connection': 'keep-alive', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0.3) Gecko/20100101 Firefox/66.0.3', 'Accept-Encoding': 'gzip, deflate, br', 'Accept-Language': 'en-US;q=0.5,en;q=0.3', 'Cache-Control': 'max-age=0', 'DNT': '1', 'Upgrade-Insecure-Requests': '1', 'Pragma': 'no-cache'}\n```\n\n### Closing Sessions\n\nSessions can also be closed to free memory:\n\n```py\n>>> session.close()\n```\n\nAlternatively, sessions can be used as context managers:\n\n```py\nwith hrequests.Session() as session:\n    resp = session.get('https://www.google.com/')\n    print(resp)\n```\n\n<hr width=50>\n\n## Concurrent & Lazy Requests\n\n### Nohup Requests\n\nSimilar to Unix's nohup command, `nohup` requests are sent in the background.\n\nAdding the `nohup=True` keyword argument will return a `LazyTLSRequest` object. This will send the request immediately, but doesn't wait for the response to be ready until an attribute of the response is accessed.\n\n```py\nresp1 = hrequests.get('https://www.google.com/', nohup=True)\nresp2 = hrequests.get('https://www.google.com/', nohup=True)\n```\n\n`resp1` and `resp2` are sent concurrently. 
They will _never_ pause the current thread, unless an attribute of the response is accessed:\n\n```py\nprint('Resp 1:', resp1.reason)  # will wait for resp1 to finish, if it hasn't already\nprint('Resp 2:', resp2.reason)  # will wait for resp2 to finish, if it hasn't already\n```\n\nThis is useful for sending requests in the background that aren't needed until later.\n\nNote: In `nohup`, a new thread is created for each request. For larger scale concurrency, please consider the following:\n\n### Easy Concurrency\n\nYou can pass an array/iterator of links to the request methods to send them concurrently. This wraps around [`hrequests.map`](https://github.com/daijro/hrequests#map):\n\n```py\n>>> hrequests.get(['https://google.com/', 'https://github.com/'])\n(<Response [200]>, <Response [200]>)\n```\n\nThis also works with `nohup`:\n\n```py\n>>> resps = hrequests.get(['https://google.com/', 'https://github.com/'], nohup=True)\n>>> resps\n(<LazyResponse[Pending]>, <LazyResponse[Pending]>)\n>>> # Sometime later...\n>>> resps\n(<Response [200]>, <Response [200]>)\n```\n\n### Grequests-style Concurrency\n\nThe methods `async_get`, `async_post`, etc. will create an unsent request. This levereges gevent, making it _blazing fast_.\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    url (str): URL to send request to\n    data (Union[str, bytes, bytearray, dict], optional): Data to send to request. Defaults to None.\n    files (Dict[str, Union[BufferedReader, tuple]], optional): Data to send to request. Defaults to None.\n    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.\n    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.\n    cookies (Union[RequestsCookieJar, dict, list], optional): Dict or CookieJar to send. Defaults to None.\n    json (dict, optional): Json to send in the request body. 
Defaults to None.\n    allow_redirects (bool, optional): Allow request to redirect. Defaults to True.\n    history (bool, optional): Remember request history. Defaults to False.\n    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n    proxy (str, optional): Proxy URL. Defaults to None.\n    <Additionally includes all parameters from `hrequests.Session` if a session was not specified>\n\nReturns:\n    hrequests.response.Response: Response object\n```\n\n</details>\n\nAsync requests are evaluated on `hrequests.map`, `hrequests.imap`, or `hrequests.imap_enum`.\n\nThis functionality is similar to [grequests](https://github.com/spyoungtech/grequests). Unlike grequests, [monkey patching](https://www.gevent.org/api/gevent.monkey.html) is not required because this does not rely on the standard python SSL library.\n\nCreate a set of unsent Requests:\n\n```py\n>>> reqs = [\n...     hrequests.async_get('https://www.google.com/', browser='firefox'),\n...     hrequests.async_get('https://www.duckduckgo.com/'),\n...     hrequests.async_get('https://www.yahoo.com/')\n... ]\n```\n\n#### map\n\nSend them all at the same time using map:\n\n```py\n>>> hrequests.map(reqs, size=3)\n[<Response [200]>, <Response [200]>, <Response [200]>]\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nConcurrently converts a list of Requests to Responses.\nParameters:\n    requests - a collection of Request objects.\n    size - Specifies the number of requests to make at a time. If None, no throttling occurs.\n    exception_handler - Callback function, called when exception occurred. Params: Request, Exception\n    timeout - Gevent joinall timeout in seconds. 
(Note: unrelated to requests timeout)\n\nReturns:\n    A list of Response objects.\n```\n\n</details>\n\n#### imap\n\n`imap` returns a generator that yields responses as they come in:\n\n```py\n>>> for resp in hrequests.imap(reqs, size=3):\n...    print(resp)\n<Response [200]>\n<Response [200]>\n<Response [200]>\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nConcurrently converts a generator object of Requests to a generator of Responses.\n\nParameters:\n    requests - a generator or sequence of Request objects.\n    size - Specifies the number of requests to make at a time. default is 2\n    exception_handler - Callback function, called when exception occurred. Params: Request, Exception\n\nYields:\n    Response objects.\n```\n\n</details>\n\n`imap_enum` returns a generator that yields a tuple of `(index, response)` as they come in. The `index` is the index of the request in the original list:\n\n```py\n>>> for index, resp in hrequests.imap_enum(reqs, size=3):\n...     print(index, resp)\n(1, <Response [200]>)\n(0, <Response [200]>)\n(2, <Response [200]>)\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nLike imap, but yields tuple of original request index and response object\nUnlike imap, failed results and responses from exception handlers that return None are not ignored. Instead, a\ntuple of (index, None) is yielded.\nResponses are still in arbitrary order.\n\nParameters:\n    requests - a sequence of Request objects.\n    size - Specifies the number of requests to make at a time. default is 2\n    exception_handler - Callback function, called when exception occurred. Params: Request, Exception\n\nYields:\n    (index, Response) tuples.\n```\n\n</details>\n\n#### Exception Handling\n\nTo handle timeouts or any other exception during the connection of the request, you can add an optional exception handler that will be called with the request and exception inside the main thread.\n\n```py\n>>> def exception_handler(request, exception):\n...    
return f'Response failed: {exception}'\n\n>>> bad_reqs = [\n...     hrequests.async_get('http://httpbin.org/delay/5', timeout=1),\n...     hrequests.async_get('http://fakedomain/'),\n...     hrequests.async_get('http://example.com/'),\n... ]\n>>> hrequests.map(bad_reqs, size=3, exception_handler=exception_handler)\n['Response failed: Connection error', 'Response failed: Connection error', <Response [200]>]\n```\n\nThe value returned by the exception handler will be used in place of the response in the result list.\n\nIf an exception handler isn't specified, the default yield type is `hrequests.FailedResponse`.\n\n<hr width=50>\n\n## HTML Parsing\n\nHTML scraping is based off [selectolax](https://github.com/rushter/selectolax), which is **over 25x faster** than bs4. This functionality is inspired by [requests-html](https://github.com/psf/requests-html).\n\n| Library        | Time (1e5 trials) |\n| -------------- | ----------------- |\n| BeautifulSoup4 | 52.6              |\n| PyQuery        | 7.5               |\n| selectolax     | **1.9**               |\n\nThe HTML parser can be accessed through the `html` attribute of the response object:\n\n```py\n>>> resp = session.get('https://python.org/')\n>>> resp.html\n<HTML url='https://www.python.org/'>\n```\n\n### Parsing page\n\nGrab a list of all links on the page, as-is (anchors excluded):\n\n```py\n>>> resp.html.links\n{'//docs.python.org/3/tutorial/', '/about/apps/', 'https://github.com/python/pythondotorg/issues', '/accounts/login/', '/dev/peps/', '/about/legal/',...\n```\n\nGrab a list of all links on the page, in absolute form (anchors excluded):\n\n```py\n>>> resp.html.absolute_links\n{'https://github.com/python/pythondotorg/issues', 'https://docs.python.org/3/tutorial/', 'https://www.python.org/about/success/', 'http://feedproxy.g...\n```\n\nSearch for text on the page:\n\n```py\n>>> resp.html.search('Python is a {} language')[0]\nprogramming\n```\n\n### Selecting elements\n\nSelect an element using a CSS 
Selector:\n\n```py\n>>> about = resp.html.find('#about')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nGiven a CSS Selector, returns a list of\n:class:`Element <Element>` objects or a single one.\n\nParameters:\n    selector: CSS Selector to use.\n    clean: Whether or not to sanitize the found HTML of ``<script>`` and ``<style>``\n    containing: If specified, only return elements that contain the provided text.\n    first: Whether or not to return just the first result.\n    raise_exception: Raise an exception if no elements are found. Default is True.\n    _encoding: The encoding format.\n\nReturns:\n    A list of :class:`Element <Element>` objects or a single one.\n\nExample CSS Selectors:\n- ``a``\n- ``a.someClass``\n- ``a#someID``\n- ``a[target=_blank]``\nSee W3School's `CSS Selectors Reference\n<https://www.w3schools.com/cssref/css_selectors.asp>`_\nfor more details.\nIf ``first`` is ``True``, only returns the first\n:class:`Element <Element>` found.\n```\n\n</details>\n\n### Introspecting elements\n\nGrab an Element's text contents:\n\n```py\n>>> print(about.text)\nAbout\nApplications\nQuotes\nGetting Started\nHelp\nPython Brochure\n```\n\nGetting an Element's attributes:\n\n```py\n>>> about.attrs\n{'id': 'about', 'class': ('tier-1', 'element-1'), 'aria-haspopup': 'true'}\n>>> about.id\n'about'\n```\n\nGet an Element's raw HTML:\n\n```py\n>>> about.html\n'<li aria-haspopup=\"true\" class=\"tier-1 element-1 \" id=\"about\">\\n<a class=\"\" href=\"/about/\" title=\"\">About</a>\\n<ul aria-hidden=\"true\" class=\"subnav menu\" role=\"menu\">\\n<li class=\"tier-2 element-1\" role=\"treeitem\"><a href=\"/about/apps/\" title=\"\">Applications</a></li>\\n<li class=\"tier-2 element-2\" role=\"treeitem\"><a href=\"/about/quotes/\" title=\"\">Quotes</a></li>\\n<li class=\"tier-2 element-3\" role=\"treeitem\"><a href=\"/about/gettingstarted/\" title=\"\">Getting Started</a></li>\\n<li class=\"tier-2 element-4\" role=\"treeitem\"><a href=\"/about/help/\" 
title=\"\">Help</a></li>\\n<li class=\"tier-2 element-5\" role=\"treeitem\"><a href=\"http://brochure.getpython.info/\" title=\"\">Python Brochure</a></li>\\n</ul>\\n</li>'\n```\n\nSelect Elements within Elements:\n\n```py\n>>> about.find_all('a')\n[<Element 'a' href='/about/' title='' class=''>, <Element 'a' href='/about/apps/' title=''>, <Element 'a' href='/about/quotes/' title=''>, <Element 'a' href='/about/gettingstarted/' title=''>, <Element 'a' href='/about/help/' title=''>, <Element 'a' href='http://brochure.getpython.info/' title=''>]\n>>> about.find('a')\n<Element 'a' href='/about/' title='' class=''>\n```\n\nSearching by HTML attributes:\n\n```py\n>>> about.find('il', role='treeitem')\n<Element 'li' role='treeitem' class=('tier-2', 'element-1')>\n```\n\nSearch for links within an element:\n\n```py\n>>> about.absolute_links\n{'http://brochure.getpython.info/', 'https://www.python.org/about/gettingstarted/', 'https://www.python.org/about/', 'https://www.python.org/about/quotes/', 'https://www.python.org/about/help/', 'https://www.python.org/about/apps/'}\n```\n\n<hr width=50>\n\n## Browser Automation\n\nHrequests supports both Firefox and Chrome browsers, headless and headful sessions, and browser addons/extensions:\n\n#### Browser support table\n\nChrome supports both Manifest v2/v3 extensions. 
Firefox only supports Manifest v2 extensions.\n\nOnly Firefox supports CloudFlare WAFs.\n\n| Browser | MV2                | MV3                | Cloudfare WAFs     |\n| ------- | ------------------ | ------------------ | ------------------ |\n| Firefox | :heavy_check_mark: | :x:                | :heavy_check_mark: |\n| Chrome  | :heavy_check_mark: | :heavy_check_mark: | :x:                |\n\n### Usage\n\nYou can spawn a `BrowserSession` instance by calling it:\n\n```py\n>>> page = hrequests.BrowserSession()  # headless=True by default\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    headless (bool, optional): Whether to run the browser in headless mode. Defaults to True.\n    session (hrequests.session.TLSSession, optional): Session to use for headers, cookies, etc.\n    resp (hrequests.response.Response, optional): Response to update with cookies, headers, etc.\n    proxy (str, optional): Proxy to use for the browser. Example: http://1.2.3.4:8080\n    mock_human (bool, optional): Whether to emulate human behavior. Defaults to False.\n    browser (Literal['firefox', 'chrome'], optional): Generate useragent headers for a specific browser\n    os (Literal['win', 'mac', 'lin'], optional): Generate headers for a specific OS\n    extensions (Union[str, Iterable[str]], optional): Path to a folder of unpacked extensions, or a list of paths to unpacked extensions\n```\n\n</details>\n\nBy default, `BrowserSession` returns a Chrome browser.\n\nTo create a Firefox session, use the chrome shortcut instead:\n\n```py\n>>> page = hrequests.firefox.BrowserSession()\n```\n\n`BrowserSession` is entirely safe to use across threads.\n\n### Render an existing Response\n\nResponses have a `.render()` method. 
This will render the contents of the response in a browser page.\n\nOnce the page is closed, the Response content and the Response's session cookies will be updated.\n\n#### Simple usage\n\nRendered browser sessions will use the browser set in the initial request.\n\nYou can set a request's browser with the `browser` parameter in the `hrequests.get` method:\n\n```py\n>>> resp = hrequests.get('https://example.com', browser='chrome')\n```\n\nOr by setting the `browser` parameter of the `hrequests.Session` object:\n\n```py\n>>> session = hrequests.Session(browser='chrome')\n>>> resp = session.get('https://example.com')\n```\n\n**Example - submitting a login form:**\n\n```py\n>>> session = hrequests.Session(browser='chrome')\n>>> resp = session.get('https://www.somewebsite.com/')\n>>> with resp.render(mock_human=True) as page:\n...     page.type('.input#username', 'myuser')\n...     page.type('.input#password', 'p4ssw0rd')\n...     page.click('#submit')\n# `session` & `resp` now have updated cookies, content, etc.\n```\n\n<details>\n<summary><strong>Or, without a context manager</strong></summary>\n\n```py\n>>> session = hrequests.Session(browser='chrome')\n>>> resp = session.get('https://www.somewebsite.com/')\n>>> page = resp.render(mock_human=True)\n>>> page.type('.input#username', 'myuser')\n>>> page.type('.input#password', 'p4ssw0rd')\n>>> page.click('#submit')\n>>> page.close()  # must close the page when done!\n```\n\n</details>\n\nThe `mock_human` parameter will emulate human-like behavior. This includes easing and randomizing mouse movements, and randomizing typing speed. This functionality is based on [botright](https://github.com/Vinyzu/botright/).\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    headless (bool, optional): Whether to run the browser in headless mode. Defaults to False.\n    mock_human (bool, optional): Whether to emulate human behavior. 
Defaults to False.\n    extensions (Union[str, Iterable[str]], optional): Path to a folder of unpacked extensions, or a list of paths to unpacked extensions\n```\n\n</details>\n\n### Properties\n\nCookies are inherited from the session:\n\n```py\n>>> page.cookies: RequestsCookieJar  # cookies are inherited from the session\n<RequestsCookieJar[Cookie(version=0, name='1P_JAR', value='2023-07-05-20', port=None, port_specified=False, domain='.somewebsite.com', domain_specified=True...\n```\n\n### Pulling page data\n\nGet the current page URL:\n\n```py\n>>> page.url: str\n'https://www.somewebsite.com/'\n```\n\nGet page content:\n\n```py\n>>> page.text: str\n'<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org/WebPage\" lang=\"en\"><head><meta content=\"Search the world\\'s information, including webpag'\n>>> page.content: bytes\nb'<!doctype html><html itemscope=\"\" itemtype=\"http://schema.org/WebPage\" lang=\"en\"><head><meta content=\"Search the world\\'s information, including webpag'\n```\n\nGet the status of the last navigation:\n\n```py\n>>> page.status_code: int\n200\n>>> page.reason: str\n'OK'\n```\n\nParsing HTML from the page content:\n\n```py\n>>> page.html.find_all('a')\n[<Element 'a' href='/about/' title='' class=''>, <Element 'a' href='/about/apps/' title=''>, ...]\n>>> page.html.find('a')\n<Element 'a' href='/about/' title='' class=''>\n```\n\nTake a screenshot of the page:\n\n```py\n>>> page.screenshot(path='screenshot.png')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nTake a screenshot of the page\n\nParameters:\n    selector (str, optional): CSS selector to screenshot\n    path (str, optional): Path to save screenshot to. Defaults to None.\n    full_page (bool): Whether to take a screenshot of the full scrollable page. Cannot be used with selector. 
Defaults to False.\n\nReturns:\n    Optional[bytes]: Returns the screenshot buffer, if `path` was not provided\n```\n\n</details>\n\n### Navigate the browser\n\nNavigate to a url:\n\n```py\n>>> page.url = 'https://bing.com'\n# or use goto\n>>> page.goto('https://bing.com')\n```\n\nNavigate through page history:\n\n```py\n>>> page.back()\n>>> page.forward()\n```\n\n### Controlling elements\n\nClick an element:\n\n```py\n>>> page.click('#my-button')\n# or through the html parser\n>>> page.html.find('#my-button').click()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): CSS selector to click.\n    button (Literal['left', 'right', 'middle'], optional): Mouse button to click. Defaults to 'left'.\n    count (int, optional): Number of clicks. Defaults to 1.\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n    wait_after (bool, optional): Wait for a page event before continuing. Defaults to True.\n```\n\n</details>\n\nHover over an element:\n\n```py\n>>> page.hover('.dropbtn')\n# or through the html parser\n>>> page.html.find('.dropbtn').hover()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): CSS selector to hover over\n    modifiers (List[Literal['Alt', 'Control', 'Meta', 'Shift']], optional): Modifier keys to press. Defaults to None.\n    timeout (float, optional): Timeout in seconds. Defaults to 90.\n```\n\n</details>\n\nType text into an element:\n\n```py\n>>> page.type('#my-input', 'Hello world!')\n# or through the html parser\n>>> page.html.find('#my-input').type('Hello world!')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): CSS selector to type in\n    text (str): Text to type\n    delay (int, optional): Delay between keypresses in ms. On mock_human, this is randomized by 50%. Defaults to 50.\n    timeout (float, optional): Timeout in seconds. 
Defaults to 30.\n```\n\n</details>\n\nDrag and drop an element:\n\n```py\n>>> page.dragTo('#source-selector', '#target-selector')\n# or through the html parser\n>>> page.html.find('#source-selector').dragTo('#target-selector')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    source (str): Source to drag from\n    target (str): Target to drop to\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n    wait_after (bool, optional): Wait for a page event before continuing. Defaults to False.\n    check (bool, optional): Check if an element is draggable before running. Defaults to False.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\n### Check page elements\n\nCheck if a selector is visible and enabled:\n\n```py\n>>> page.isVisible('#my-selector'): bool\n>>> page.isEnabled('#my-selector'): bool\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): Selector to check\n```\n\n</details>\n\nEvaluate and return a script:\n\n```py\n>>> page.evaluate('selector => document.querySelector(selector).checked', '#my-selector')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    script (str): Javascript to evaluate in the page\n    arg (str, optional): Argument to pass into the javascript function\n```\n\n</details>\n\n### Awaiting events\n\n```py\n>>> page.awaitNavigation()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    timeout (float, optional): Timeout in seconds. 
Defaults to 30.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\nWait for a script or function to return a truthy value:\n\n```py\n>>> page.awaitScript('selector => document.querySelector(selector).value === 100', '#progress')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    script (str): Script to evaluate\n    arg (str, optional): Argument to pass to script\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\nWait for the URL to match:\n\n```py\n>>> page.awaitUrl(re.compile(r'https?://www\\.google\\.com/.*'), timeout=10)\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    url (Union[str, Pattern[str], Callable[[str], bool]]) - URL to match for\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\nWait for an element to exist on the page:\n\n```py\n>>> page.awaitSelector('#my-selector')\n# or through the html parser\n>>> page.html.find('#my-selector').awaitSelector()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): Selector to wait for\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\nWait for an element to be enabled:\n\n```py\n>>> page.awaitEnabled('#my-selector')\n# or through the html parser\n>>> page.html.find('#my-selector').awaitEnabled()\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    selector (str): Selector to wait for\n    timeout (float, optional): Timeout in seconds. 
Defaults to 30.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n```\n\n</details>\n\nScreenshot an element:\n\n```py\n>>> page.screenshot('#my-selector', path='screenshot.png')\n# or through the html parser\n>>> page.html.find('#my-selector').screenshot('selector.png')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nScreenshot an element\n\nParameters:\n    selector (str, optional): CSS selector to screenshot\n    path (str, optional): Path to save screenshot to. Defaults to None.\n    full_page (bool): Whether to take a screenshot of the full scrollable page. Cannot be used with selector. Defaults to False.\n\nReturns:\n    Optional[bytes]: Returns the screenshot buffer, if `path` was not provided\n```\n\n</details>\n\n### Adding Firefox/Chrome extensions\n\nFirefox/Chrome extensions can be easily imported into a browser session. Some potentially useful extensions include:\n\n- **hektCaptcha** - Hcaptcha solver ([Chrome](https://github.com/Wikidepia/hektCaptcha-extension))\n\n- **uBlock Origin** - Ad & popup blocker ([Chrome](https://github.com/gorhill/uBlock), [Firefox](https://github.com/gorhill/uBlock))\n\n- **FastForward** - Bypass & skip link redirects ([Chrome](https://nightly.link/FastForwardTeam/FastForward/workflows/main/main/FastForward_chromium.zip), [Firefox](https://nightly.link/FastForwardTeam/FastForward/workflows/main/main/FastForward_firefox.zip))\n\nNote: Firefox extensions are _Firefox-only_, and Chrome extensions are _Chrome-only_.\n\nIf you plan on using Firefox-specific or Chrome-specific extensions, make sure to set your `browser` parameter to the correct browser before rendering the page:\n\n```py\n# when dealing with captchas, make sure to use firefox\n>>> resp = hrequests.get('https://accounts.hcaptcha.com/demo', browser='firefox')\n```\n\nExtensions are added with the `extensions` parameter:\n\n- This can be a list of absolute paths to unpacked extensions:\n\n  ```py\n  with 
resp.render(extensions=['C:\\\\extensions\\\\hektcaptcha', 'C:\\\\extensions\\\\ublockorigin']):\n  ```\n\n- Or a folder containing the unpacked extensions:\n  ```py\n  with resp.render(extensions='C:\\\\extensions'):\n  ```\n  Note that these need to be _unpacked_ extensions. You can unpack a `.crx` file by changing the file extension to `.zip` and extracting the contents.\n\nHere is a usage example with a captcha solver:\n\n```py\n>>> resp = hrequests.get('https://accounts.hcaptcha.com/demo', browser='firefox')\n>>> with resp.render(extensions=['C:\\\\extensions\\\\hektcaptcha']) as page:\n...     page.awaitSelector('.hcaptcha-success')  # wait for captcha to finish\n...     page.click('input[type=submit]')\n```\n\n### Requests & Responses\n\nRequests can also be sent within browser sessions. These operate the same as the standard `hrequests.request`, and will use the browser's cookies and headers. The `BrowserSession` cookies will be updated with each request.\n\nThis returns a normal `Response` object:\n\n```py\n>>> resp = page.get('https://duckduckgo.com')\n```\n\n<details>\n<summary>Parameters</summary>\n\n```\nParameters:\n    url (str): URL to send request to\n    params (dict, optional): Dictionary of URL parameters to append to the URL. Defaults to None.\n    data (Union[str, dict], optional): Data to send to request. Defaults to None.\n    headers (dict, optional): Dictionary of HTTP headers to send with the request. Defaults to None.\n    form (dict, optional): Form data to send with the request. Defaults to None.\n    multipart (dict, optional): Multipart data to send with the request. Defaults to None.\n    timeout (float, optional): Timeout in seconds. Defaults to 30.\n    verify (bool, optional): Verify the server's TLS certificate. Defaults to True.\n    max_redirects (int, optional): Maximum number of redirects to follow. 
Defaults to None.\n\nThrows:\n    hrequests.exceptions.BrowserTimeoutException: If timeout is reached\n\nReturns:\n    hrequests.response.Response: Response object\n```\n\n</details>\n\nOther methods include `post`, `put`, `delete`, `head`, and `patch`.\n\n### Closing the page\n\nThe `BrowserSession` object must be closed when finished. This will close the browser, update the response data, and merge new cookies with the session cookies.\n\n```py\n>>> page.close()\n```\n\nNote that this is automatically done when using a context manager.\n\nSession cookies are updated:\n\n```py\n>>> session.cookies: RequestsCookieJar\n<RequestsCookieJar[Cookie(version=0, name='MUID', value='123456789', port=None, port_specified=False, domain='.bing.com', domain_specified=True, domain_initial_dot=True...\n```\n\nResponse data is updated:\n\n```py\n>>> resp.url: str\n'https://www.bing.com/?toWww=1&redig=823778234657823652376438'\n>>> resp.content: Union[bytes, str]\n'<!DOCTYPE html><html lang=\"en\" dir=\"ltr\"><head><meta name=\"theme-color\" content=\"#4F4F4F\"><meta name=\"description\" content=\"Bing helps you turn inform...\n```\n\n#### Other ways to create a Browser Session\n\nYou can use `.render` to spawn a `BrowserSession` object directly from a url:\n\n```py\n# Using a Session:\n>>> page = session.render('https://google.com')\n# Or without a session at all:\n>>> page = hrequests.render('https://google.com')\n```\n\nMake sure to close all `BrowserSession` objects when done!\n\n```py\n>>> page.close()\n```\n\n---\n\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Hrequests (human requests) is a simple, configurable, feature-rich, replacement for the Python requests library.",
    "version": "0.8.2",
    "project_urls": {
        "Documentation": "https://daijro.gitbook.io/hrequests/",
        "Homepage": "https://github.com/daijro/hrequests",
        "Repository": "https://github.com/daijro/hrequests"
    },
    "split_keywords": [
        "tls",
        " client",
        " http",
        " scraping",
        " requests",
        " humans",
        " playwright"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9d0c4345f1279afc4223eb0402d4aeb441986dab2cb1ca4fab321e2548c0fe51",
                "md5": "f4656fcef1baeea211a0b4e3c715c733",
                "sha256": "1304891c007a43424c68aa1b2d4fb33615e81dbce7d2c69ef7bb15e7b271fed8"
            },
            "downloads": -1,
            "filename": "hrequests-0.8.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f4656fcef1baeea211a0b4e3c715c733",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 78300,
            "upload_time": "2024-03-31T07:09:22",
            "upload_time_iso_8601": "2024-03-31T07:09:22.035969Z",
            "url": "https://files.pythonhosted.org/packages/9d/0c/4345f1279afc4223eb0402d4aeb441986dab2cb1ca4fab321e2548c0fe51/hrequests-0.8.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "33c17caaceee3c3dec2d19efd40beeb77f374171bfbfc2fd7e63b011abfc04e6",
                "md5": "4c5bab9af56f4ac26aa41dca98980835",
                "sha256": "cb6ff1e9e2090fed7b4f79260811313ecfce8fc00707d7c745bce728c566e2dd"
            },
            "downloads": -1,
            "filename": "hrequests-0.8.2.tar.gz",
            "has_sig": false,
            "md5_digest": "4c5bab9af56f4ac26aa41dca98980835",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 76189,
            "upload_time": "2024-03-31T07:09:24",
            "upload_time_iso_8601": "2024-03-31T07:09:24.475476Z",
            "url": "https://files.pythonhosted.org/packages/33/c1/7caaceee3c3dec2d19efd40beeb77f374171bfbfc2fd7e63b011abfc04e6/hrequests-0.8.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-31 07:09:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "daijro",
    "github_project": "hrequests",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "hrequests"
}
        