# Playwright Capture
Simple replacement for [splash](https://github.com/scrapinghub/splash) using [playwright](https://github.com/microsoft/playwright-python).
# Install
```bash
pip install playwrightcapture
```
# Usage
A very basic example:
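```python
import asyncio
from playwrightcapture import Capture

async def main(url: str) -> None:
    async with Capture() as capture:
        await capture.initialize_context()
        entries = await capture.capture_page(url, max_depth_capture_time=90)

asyncio.run(main("https://example.com"))  # replace with the URL to capture
```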
`entries` is a dictionary that contains (if all goes well) the HAR, the screenshot, all the cookies of the session, the URL as it is in the browser at the end of the capture, and the full HTML page as rendered.
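To make the shape of the result more concrete, here is a minimal sketch of a helper that writes whichever artifacts are present to disk. The function name `save_entries`, the output file names, and the dictionary keys (`har`, `png`, `html`, `cookies`, `last_redirected_url`) are assumptions made for illustration and are not confirmed by this README; check them against the version you have installed.

```python
import json
from pathlib import Path
from typing import Any


def save_entries(entries: dict[str, Any], out_dir: Path) -> list[str]:
    """Write the capture artifacts found in `entries` to `out_dir`.

    Returns the names of the files written. All key names used here
    are assumptions (hypothetical), not taken from the README.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[str] = []

    # HAR: assumed to be a JSON-serializable dictionary.
    if entries.get("har") is not None:
        (out_dir / "capture.har").write_text(json.dumps(entries["har"]), encoding="utf-8")
        written.append("capture.har")

    # Screenshot: assumed to be raw PNG bytes.
    if entries.get("png") is not None:
        (out_dir / "screenshot.png").write_bytes(entries["png"])
        written.append("screenshot.png")

    # Rendered page: assumed to be a string of HTML.
    if entries.get("html") is not None:
        (out_dir / "page.html").write_text(entries["html"], encoding="utf-8")
        written.append("page.html")

    # Cookies: assumed to be a JSON-serializable list.
    if entries.get("cookies") is not None:
        (out_dir / "cookies.json").write_text(json.dumps(entries["cookies"]), encoding="utf-8")
        written.append("cookies.json")

    # Final URL after redirects: assumed to be a string.
    if entries.get("last_redirected_url") is not None:
        (out_dir / "last_url.txt").write_text(entries["last_redirected_url"], encoding="utf-8")
        written.append("last_url.txt")

    return written
```

Because a capture may not produce every artifact, the helper skips missing keys instead of raising an error.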
# reCAPTCHA bypass
No black magic: it is just a reimplementation of a [well-known technique](https://github.com/NikolaiT/uncaptcha3),
as implemented [here](https://github.com/Binit-Dhakal/Google-reCAPTCHA-v3-solver-using-playwright-python)
and [here](https://github.com/embium/solverecaptchas).
This module will try to bypass reCAPTCHA-protected websites if you install it this way:
```bash
pip install playwrightcapture[recaptcha]
```
This will install `requests`, `pydub` and `SpeechRecognition`. To work, `pydub`
requires `ffmpeg` or `libav`; see the [install guide](https://github.com/jiaaro/pydub#installation)
for more details.
`SpeechRecognition` uses the Google Speech Recognition API to turn the audio file into text (I hope you appreciate the irony).
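As a quick sanity check before relying on the bypass, the sketch below reports whether the optional dependencies listed above can be found. The function name `recaptcha_prerequisites` is hypothetical, and looking for `avconv` as the `libav` executable is an assumption; the check only tests that the modules and binaries are discoverable, not that the bypass will succeed.

```python
import importlib.util
import shutil


def recaptcha_prerequisites() -> dict[str, bool]:
    """Report which optional reCAPTCHA dependencies can be found."""
    # `SpeechRecognition` is imported as `speech_recognition`.
    modules = ("requests", "pydub", "speech_recognition")
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # pydub needs an ffmpeg or libav binary on the PATH
    # (assumption: the libav executable is named `avconv`).
    status["ffmpeg_or_avconv"] = any(shutil.which(tool) for tool in ("ffmpeg", "avconv"))
    return status


if __name__ == "__main__":
    for name, ok in recaptcha_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```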
# Package metadata
- Name: PlaywrightCapture
- Version: 1.27.5 (uploaded 2024-12-26)
- Summary: A simple library to capture websites using playwright
- Author: Raphaël Vinot (raphael.vinot@circl.lu)
- License: BSD-3-Clause
- Requires Python: >=3.9, <4.0
- Homepage: https://github.com/Lookyloo/PlaywrightCapture