quokka-web


- Name: quokka-web
- Version: 0.0.2.0
- Summary: Quokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and web scraping tasks. With Quokka, you can easily navigate web pages, extract data, and interact with page elements using an intuitive API. Quokka supports asynchronous and parallel execution, making it suitable for a wide range of IO and CPU-bound workloads. Get started with Quokka to streamline your browser automation and web scraping workflows.
- Upload time: 2024-03-10 04:47:29
- Author: steveflyer
- Docs URL: None
- Requirements: No requirements were recorded.

# Quokka - Browser Automation Library with Playwright

Quokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and manipulation tasks. It provides a convenient facade for various browser interactions, making it easier to navigate web pages, extract data, and interact with page elements.

## Key Features

- **Asynchronous and Parallel Execution:** Quokka operates entirely asynchronously. Building on Playwright, it uses multiple processes, each running a single coroutine, for parallel execution. This architecture handles both IO-bound and CPU-bound workloads well when ample resources are available.
- **Multi-threaded Crawling with Ease:** Quokka's `BaseCrawler` class enables users to effortlessly transition from single-threaded to multi-threaded crawling. By taking advantage of the provided crawler template, you can seamlessly convert a single-threaded crawler into a multi-threaded one.
- **Easy Browser Management:** Quokka's `Agent` class provides a streamlined interface for managing browser instances, including starting, stopping, and page navigation.
- **Data Extraction:** With the `data_extractor` module, Quokka allows you to easily extract data from web pages using customizable selectors and extraction patterns.
- **Page Interaction:** The `page_interactor` module enables you to interact with web page elements, such as clicking, typing, and scrolling, making automation tasks a breeze.
- **Custom Hooks:** Quokka supports customizable hooks, allowing you to extend and customize the behavior of the `Agent` class to fit your specific needs.
- **Extensible:** Quokka exposes Playwright's `playwright` and `page` instances, enabling users to extend the library's functionality as required.
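
The "multiple processes, each running a single coroutine" layout described in the first feature can be pictured with the standard library alone. The following is only a sketch of that general pattern, not Quokka's implementation: `crawl_one`, `run_in_process`, and `crawl_all` are hypothetical names, and the coroutine sleeps instead of driving a browser.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor


async def crawl_one(url: str) -> str:
    """Hypothetical stand-in for one crawler coroutine (no browser involved)."""
    await asyncio.sleep(0.01)  # simulated IO wait
    return f"crawled {url}"


def run_in_process(url: str) -> str:
    """Worker entry point: each process runs one event loop with one coroutine."""
    return asyncio.run(crawl_one(url))


def crawl_all(urls: list[str], workers: int = 2) -> list[str]:
    """Fan the URLs out across worker processes and collect results in input order."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_in_process, urls))


if __name__ == "__main__":
    for line in crawl_all(["https://example.com/a", "https://example.com/b"]):
        print(line)
```

Keeping one event loop per process avoids sharing a loop across processes, and the `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.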

## Installation

```bash
pip install quokka-web
```

## Getting Started

Quokka's intuitive API makes browser automation a straightforward process. Here's a simple example:

```python
from quokka_web import Agent


async def main():
    agent = await Agent.instantiate(headless=True)
    await agent.start()

    # Your automation code here

    await agent.stop()


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
```
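
If the automation code in the example above raises, `agent.stop()` is never reached. One common way to guard against that is a `try`/`finally` block. The sketch below uses a hypothetical `StubAgent` that only mimics the three calls shown above (`instantiate`, `start`, `stop`), so it runs without a browser; whether Quokka's `Agent` offers its own cleanup mechanism is not covered here.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in mimicking the calls shown above; not Quokka's Agent."""

    def __init__(self) -> None:
        self.running = False

    @classmethod
    async def instantiate(cls, headless: bool = True) -> "StubAgent":
        return cls()

    async def start(self) -> None:
        self.running = True

    async def stop(self) -> None:
        self.running = False


async def run_with_cleanup(agent: StubAgent, task) -> None:
    """Start the agent, run the task, and always stop the agent afterwards."""
    await agent.start()
    try:
        await task(agent)
    finally:
        await agent.stop()


async def failing_task(agent: StubAgent) -> None:
    raise RuntimeError("simulated automation failure")


async def main() -> bool:
    agent = await StubAgent.instantiate(headless=True)
    try:
        await run_with_cleanup(agent, failing_task)
    except RuntimeError:
        pass  # the error still propagates; cleanup has already run
    return agent.running


if __name__ == "__main__":
    print(asyncio.run(main()))  # False: the agent was stopped despite the error
```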

## Documentation

For detailed usage instructions, examples, and customization options, please refer to the [Documentation](link_to_documentation).

## Examples

Base Crawler Example:

```python
import asyncio

from quokka_web import BaseCrawler, Debugger


class MyCrawler(BaseCrawler):
    async def _crawl(self, *args, **kwargs):
        # Core crawling logic using browser_agent goes here.
        ...


async def main():
    crawler = await MyCrawler.instantiate(debug_tool=Debugger(verbose=True))
    await crawler.start()
    await crawler.crawl()
    await crawler.stop()


if __name__ == "__main__":
    asyncio.run(main())
```
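
The example overrides `_crawl` but calls `crawl`, which suggests a template-method arrangement in which the public method forwards to the subclass hook. The sketch below illustrates that general pattern only. `SketchBaseCrawler` and `EchoCrawler` are hypothetical classes written for this illustration, and nothing here is taken from Quokka's actual `BaseCrawler`.

```python
import asyncio


class SketchBaseCrawler:
    """Hypothetical base class for illustration; NOT Quokka's BaseCrawler."""

    async def crawl(self, *args, **kwargs):
        # Public entry point: shared bookkeeping could wrap this call.
        return await self._crawl(*args, **kwargs)

    async def _crawl(self, *args, **kwargs):
        raise NotImplementedError("subclasses implement the crawling logic")


class EchoCrawler(SketchBaseCrawler):
    async def _crawl(self, *urls):
        # Placeholder logic: a real crawler would visit each URL in a browser.
        return [f"visited {url}" for url in urls]


if __name__ == "__main__":
    print(asyncio.run(EchoCrawler().crawl("https://example.com")))
```

With this split, callers only ever use `crawl`, and each subclass supplies its own `_crawl`.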

## Contributing

Contributions to Quokka are welcome! Please read our [Contribution Guidelines](link_to_contribution_guidelines) for more information on how to contribute to the project.

## License

This project is licensed under the [MIT License](link_to_license).

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "quokka-web",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "steveflyer",
    "author_email": "steveflyer7@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/4a/7d/9f851284d48e62f567a8d516865f82ec8c00b2208048db04bf5a47a48672/quokka-web-0.0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Quokka is a powerful Python library built on top of Playwright, designed to simplify browser automation and web scraping tasks. With Quokka, you can easily navigate web pages, extract data, and interact with page elements using an intuitive API. Quokka supports asynchronous and parallel execution, making it suitable for a wide range of IO and CPU-bound workloads. Get started with Quokka to streamline your browser automation and web scraping workflows.",
    "version": "0.0.2.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "38a39ad7952b2f7e1a26d8d5ea2894c2bdc4d4c84569b2ff115249055c015f19",
                "md5": "ccd8f6fc04b7298e6bb16f8ac32cb92b",
                "sha256": "6b2c84c4daecdf37282ae0eb2c3700f6c2eec3252a97efd0e53299b27c88c9eb"
            },
            "downloads": -1,
            "filename": "quokka_web-0.0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ccd8f6fc04b7298e6bb16f8ac32cb92b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 22931,
            "upload_time": "2024-03-10T04:47:27",
            "upload_time_iso_8601": "2024-03-10T04:47:27.731209Z",
            "url": "https://files.pythonhosted.org/packages/38/a3/9ad7952b2f7e1a26d8d5ea2894c2bdc4d4c84569b2ff115249055c015f19/quokka_web-0.0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4a7d9f851284d48e62f567a8d516865f82ec8c00b2208048db04bf5a47a48672",
                "md5": "e276452a45beea2a20bde21f13f37953",
                "sha256": "f00a52ef95fd0d84fc721d04d559d8e75b85cff93b6b1fab32d211187645214b"
            },
            "downloads": -1,
            "filename": "quokka-web-0.0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e276452a45beea2a20bde21f13f37953",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 17348,
            "upload_time": "2024-03-10T04:47:29",
            "upload_time_iso_8601": "2024-03-10T04:47:29.332102Z",
            "url": "https://files.pythonhosted.org/packages/4a/7d/9f851284d48e62f567a8d516865f82ec8c00b2208048db04bf5a47a48672/quokka-web-0.0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-10 04:47:29",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "quokka-web"
}
        