<p align="center">
<!-- Python -->
<a href="https://www.python.org" alt="Python">
<img src="https://badges.aleen42.com/src/python.svg" />
</a>
<!-- Version -->
<a href="https://badge.fury.io/py/multipipe"><img src="https://badge.fury.io/py/multipipe.svg" alt="PyPI version" height="18"></a>
<!-- Black -->
<a href="https://github.com/psf/black" alt="Code style: black">
<img src="https://img.shields.io/badge/code%20style-black-000000.svg" />
</a>
<!-- License -->
<a href="https://lbesson.mit-license.org/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
<!-- Google Colab -->
<!-- <a href="https://colab.research.google.com/github/AmenRa/multipipe/blob/master/notebooks/1_overview.ipynb"> -->
<!-- <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> -->
<!-- </a> -->
</p>

## ⚡️ Introduction
[multipipe](https://github.com/AmenRa/multipipe) is a Python utility for building pipelines of functions and running them over any iterable (e.g., lists, generators) using multiprocessing. It is built on top of [multiprocess](https://github.com/uqfoundation/multiprocess).
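
To make the idea of a "pipeline of functions" concrete, here is a minimal sketch in plain Python. It is not multipipe's implementation: `compose`, `add_one`, and `double` are hypothetical names, the code runs sequentially in a single process, and it assumes the functions are applied in the order they are listed.

```python
from functools import reduce


def compose(functions):
    """Return a callable that applies `functions` from left to right."""

    def run(x):
        # Feed the output of each function into the next one
        return reduce(lambda acc, f: f(acc), functions, x)

    return run


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Hypothetical stand-in for a pipeline: add_one first, then double
pipeline = compose([add_one, double])

# Apply the composed function to every element of an iterable
results = [pipeline(x) for x in range(5)]
print(results)  # [2, 4, 6, 8, 10]
```

A multiprocessing pipeline follows the same idea, except that the elements of the iterable are distributed across worker processes instead of being handled one after another.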
## 🔌 Requirements
```
python>=3.8
```
## 💾 Installation
```bash
pip install multipipe
```
## 💡 Examples
### Basic usage
```python
from multipipe import Multipipe

def add(x):
    return x + 1

def mul(x):
    return x * 2

# Build a pipeline from the two functions and run it over an iterable
pipe = Multipipe([add, mul])
pipe(range(10))
```
Output, assuming the functions are applied in the order listed (`add`, then `mul`):
```python
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```
### Using partials
Sometimes, you may want to use [partials](https://docs.python.org/3/library/functools.html#functools.partial) to pass arguments to your functions.
```python
from multipipe import Multipipe
from functools import partial

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

# Bind the extra argument `y` so each step takes a single input
pipe = Multipipe([partial(add, y=1), partial(mul, y=2)])
pipe(range(10))
```
Output, assuming the functions are applied in the order listed (`add`, then `mul`):
```python
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```
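
For readers less familiar with `functools.partial`, the short sketch below shows what binding an argument does, independently of multipipe. The name `add_one` is only illustrative.

```python
from functools import partial


def add(x, y):
    return x + y


# Freeze y=1; the result is a callable that only needs x
add_one = partial(add, y=1)

print(add_one(5))        # 6
print(add_one.func is add)  # True
print(add_one.keywords)  # {'y': 1}
```

Because the bound arguments are stored on the partial object itself, each step in a pipeline can be called with a single input element.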
### Complex IO pipeline
In this example, we lazily read data from a [JSON Lines (JSONL)](https://jsonlines.org) file, run a pipeline of functions over it lazily, and write the results to a new [JSONL](https://jsonlines.org) file.
In practice, this lets you process huge files without loading their entire content into memory at once.
```python
from multipipe import Multipipe
from unified_io import read_jsonl, write_jsonl

# Create a pipeline of functions
pipe = Multipipe([ ... ])

# Read a JSONL file line-by-line as a generator, i.e., lazily
in_data = read_jsonl("path/to/input/file.jsonl", generator=True)

# This is still a generator.
# The pipeline will be executed lazily.
out_data = pipe(in_data, generator=True)

# Write a JSONL file from the generator executing the pipeline
write_jsonl(out_data, "path/to/output/file.jsonl")
```
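
The lazy read, transform, write pattern can also be sketched with the standard library alone. This is only an illustration of the idea, not how `unified_io` or multipipe are implemented: `iter_jsonl`, `dump_jsonl`, and `transform` are hypothetical helpers, and the transformation runs in a single process.

```python
import json
import os
import tempfile


def iter_jsonl(path):
    """Yield one parsed record per non-empty line (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def dump_jsonl(records, path):
    """Write records one per line, pulling them from the iterable lazily."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
            count += 1
    return count


def transform(record):
    # Hypothetical processing step: double the "value" field
    return {**record, "value": record["value"] * 2}


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.jsonl")
    dst = os.path.join(tmp, "out.jsonl")

    # Create a small input file for the demonstration
    dump_jsonl(({"value": i} for i in range(3)), src)

    # map() is lazy, so records flow one at a time from reader to writer
    written = dump_jsonl(map(transform, iter_jsonl(src)), dst)
    output = list(iter_jsonl(dst))

print(written, output)
```

At no point is the whole input held in memory: each record is read, transformed, and written before the next one is requested.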
## 🎁 Feature Requests
Would you like to see other features implemented? Please open a [feature request](https://github.com/AmenRa/multipipe/issues/new?assignees=&labels=enhancement&template=feature_request.md&title=%5BFeature+Request%5D+title).
## 🤘 Want to contribute?
Would you like to contribute? Please drop me an [e-mail](mailto:elias.bssn@gmail.com?subject=[GitHub]%20multipipe).
## 📄 License
[multipipe](https://github.com/AmenRa/multipipe) is open-source software licensed under the [MIT license](LICENSE).