<h1 align="center">
<img src="https://raw.githubusercontent.com/DanchukIvan/byteflow/main/docs/icons/logo.png" alt="byteflow" width="200px">
<br>
</h1>
# **Simple data workflows**
Byteflow is a microframework that makes it easier to retrieve information from APIs and regular websites.
Unlike full frameworks such as Scrapy or single-purpose libraries such as BeautifulSoup, Byteflow unifies the information extraction process. This keeps it easy to use while still offering a fairly wide range of functionality.
## **Why use Byteflow?**
* 🚀 Byteflow is built on top of asyncio and asynchronous libraries, which speeds up I/O-bound code.
* 🔁 With Byteflow, there is no need to rebuild the data scraping process for every project. From project to project, you keep a single, transparent architecture.
* ![s3](https://raw.githubusercontent.com/DanchukIvan/byteflow/main/docs/img/amazons3.svg) ![kafka](https://raw.githubusercontent.com/DanchukIvan/byteflow/main/docs/img/apachekafka.svg) ![psql](https://raw.githubusercontent.com/DanchukIvan/byteflow/main/docs/img/postgresql.svg) ![clickhouse](https://raw.githubusercontent.com/DanchukIvan/byteflow/main/docs/img/clickhouse.svg) Byteflow lets you route data to any backend: S3-compatible storage, a database, a network file system, a broker or message bus, and so on.
* ⚙️ Byteflow lets you choose what to do with the data: hold it in memory until a set threshold accumulates or send it to the backend immediately, and pre-process it or leave it as is.
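
The two ideas above, concurrent I/O with asyncio and holding data in memory until a threshold is reached, can be illustrated with a minimal standard-library sketch. This is not Byteflow's API: `BufferedSink`, `fetch`, `collect`, and `flush_threshold` are hypothetical names invented for this example, and the network call and backend write are simulated with `asyncio.sleep`.

```python
import asyncio


class BufferedSink:
    """Hypothetical helper: hold records in memory and flush them in batches."""

    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold
        self.pending: list[dict] = []
        # Stands in for a real backend (object storage, database, broker).
        self.flushed_batches: list[list[dict]] = []

    async def put(self, record: dict) -> None:
        self.pending.append(record)
        # Flush as soon as the in-memory buffer reaches the threshold.
        if len(self.pending) >= self.flush_threshold:
            await self.flush()

    async def flush(self) -> None:
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        await asyncio.sleep(0)  # placeholder for an asynchronous write
        self.flushed_batches.append(batch)


async def fetch(resource_id: int) -> dict:
    """Hypothetical fetch: the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    return {"id": resource_id}


async def collect(resource_ids: list[int], flush_threshold: int) -> BufferedSink:
    sink = BufferedSink(flush_threshold)
    # All fetches run concurrently; each result is handled as it completes.
    for finished in asyncio.as_completed([fetch(i) for i in resource_ids]):
        await sink.put(await finished)
    await sink.flush()  # send whatever is left below the threshold
    return sink


if __name__ == "__main__":
    sink = asyncio.run(collect(list(range(7)), flush_threshold=3))
    print([len(batch) for batch in sink.flushed_batches])
```

Setting `flush_threshold=1` in this sketch corresponds to sending every record to the backend immediately, while a larger value trades memory for fewer writes.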
## **Installation**
Installation is as simple as:
```
pip install byteflow
```
## **Dependencies**
> Byteflow's core dependencies are the following libraries:
>
> * aiohttp
> * aioitertools
> * fsspec
> * more-itertools
> * regex
> * uvloop (for Unix platforms)
> * yarl
> * dateparser
## **More information about the project**
You can learn more about Byteflow in the [project documentation](https://danchukivan.github.io/Byteflow/), which includes the API and Tutorial sections. Changes are tracked in the Changelog section.
## **Project status**
Byteflow is currently an early alpha project with an unstable API and limited functionality. Using it in production is **strongly discouraged**.
Raw data
{
"_id": null,
"home_page": null,
"name": "byteflow",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Danchuk Ivan <ivan.s.danchuk@gmail.com>",
"keywords": "scraping, web scraping, asyncio, web crawler, api crawler, api scraping",
"author": null,
"author_email": "Danchuk Ivan <ivan.s.danchuk@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/46/33/2e112b238009ff3843267ec7042a1b48457a7e1912a0fb15eba89bad6047/byteflow-0.2.1a7.post5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "Simple scrape as SELECT * FROM ANYTHING in network",
"version": "0.2.1a7.post5",
"project_urls": {
"Documentation": "https://danchukivan.github.io/byteflow/",
"Repository": "https://github.com/DanchukIvan/byteflow.git"
},
"split_keywords": [
"scraping",
" web scraping",
" asyncio",
" web crawler",
" api crawler",
" api scraping"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "86fe2c500d916a08e8901960aabb38c972fa996d1b8ddb1ff47042fd3717a675",
"md5": "0db5fe64d2b747394e2783002c0d03d0",
"sha256": "e251ede61d5c6dae76435a03d9bd540bbdc841c5a9051718b9fd74a9b0fdb14e"
},
"downloads": -1,
"filename": "byteflow-0.2.1a7.post5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0db5fe64d2b747394e2783002c0d03d0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 52698,
"upload_time": "2024-08-21T06:37:52",
"upload_time_iso_8601": "2024-08-21T06:37:52.028375Z",
"url": "https://files.pythonhosted.org/packages/86/fe/2c500d916a08e8901960aabb38c972fa996d1b8ddb1ff47042fd3717a675/byteflow-0.2.1a7.post5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "46332e112b238009ff3843267ec7042a1b48457a7e1912a0fb15eba89bad6047",
"md5": "0ef29792d5025b688b21003aa6d8b794",
"sha256": "b5e2a165a0733b8e73609942e6aa3395cb57b1d53cfc91f4736b758272c5579a"
},
"downloads": -1,
"filename": "byteflow-0.2.1a7.post5.tar.gz",
"has_sig": false,
"md5_digest": "0ef29792d5025b688b21003aa6d8b794",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 42544,
"upload_time": "2024-08-21T06:37:53",
"upload_time_iso_8601": "2024-08-21T06:37:53.932748Z",
"url": "https://files.pythonhosted.org/packages/46/33/2e112b238009ff3843267ec7042a1b48457a7e1912a0fb15eba89bad6047/byteflow-0.2.1a7.post5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-21 06:37:53",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "DanchukIvan",
"github_project": "byteflow",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "byteflow"
}