multipipe


Name: multipipe
Version: 0.1.0
Home page: https://github.com/AmenRa/unified-io
Summary: A Python utility for multiprocessing pipelines
Upload time: 2023-05-19 12:46:32
Author: Elias Bassani
Requires Python: >=3.7
Keywords: pipeline, multiprocessing, multithreading, utils, utilities
Requirements: No requirements were recorded.
            <p align="center">
  <!-- Python -->
  <a href="https://www.python.org" alt="Python">
      <img src="https://badges.aleen42.com/src/python.svg" />
  </a>
  <!-- Version -->
  <a href="https://badge.fury.io/py/multipipe"><img src="https://badge.fury.io/py/multipipe.svg" alt="PyPI version" height="18"></a>
  <!-- Black -->
  <a href="https://github.com/psf/black" alt="Code style: black">
      <img src="https://img.shields.io/badge/code%20style-black-000000.svg" />
  </a>
  <!-- License -->
  <a href="https://lbesson.mit-license.org/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
  <!-- Google Colab -->
  <!-- <a href="https://colab.research.google.com/github/AmenRa/multipipe/blob/master/notebooks/1_overview.ipynb"> -->
      <!-- <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> -->
  <!-- </a> -->
</p>

## ⚡️ Introduction

[multipipe](https://github.com/AmenRa/multipipe) is a Python utility for building pipelines of functions and running them over any iterable (e.g., lists, generators) using multiprocessing. It is built on top of [multiprocess](https://github.com/uqfoundation/multiprocess).


## 🔌 Requirements
```
python>=3.8
```

## 💾 Installation
```bash
pip install multipipe
```

## 💡 Examples

### Basic usage
```python
from multipipe import Multipipe

def add(x):
    return x + 1

def mul(x):
    return x * 2

pipe = Multipipe([ add, mul ])
pipe(range(10))
```
Output:
```python
[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]
```
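
For intuition, the values a function pipeline produces can be reproduced sequentially in plain Python. The sketch below is not multipipe's implementation: `run_sequential` is a hypothetical helper, it assumes the functions are applied in list order, and it does no multiprocessing at all.

```python
from functools import reduce

def add(x):
    return x + 1

def mul(x):
    return x * 2

def run_sequential(functions, iterable):
    """Apply each function, in list order, to every item (hypothetical helper)."""
    return [
        reduce(lambda value, fn: fn(value), functions, item)
        for item in iterable
    ]

# Each item goes through add first, then mul.
result = run_sequential([add, mul], range(10))
print(result)
```

A reference like this can be handy for checking a parallel run against a known-good sequential result on a small input.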

### Using partials

Sometimes, you may want to use [partials](https://docs.python.org/3/library/functools.html#functools.partial) to pass arguments to your functions.

```python
from multipipe import Multipipe
from functools import partial

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

pipe = Multipipe([ partial(add, y=1), partial(mul, y=2) ])
pipe(range(10))
```
Output:
```python
[ 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 ]
```
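
As a quick check of what `functools.partial` does here, the standalone snippet below uses only the standard library (no multipipe). It binds `y` ahead of time so that each resulting callable takes a single argument. The names `add_one` and `double` are illustrative and do not come from the library.

```python
from functools import partial

def add(x, y):
    return x + y

def mul(x, y):
    return x * y

# Bind the keyword argument y; the resulting callables take only x.
add_one = partial(add, y=1)
double = partial(mul, y=2)

print(add_one(3))          # 3 + 1
print(double(add_one(3)))  # (3 + 1) * 2

# A partial keeps a reference to the wrapped function and the bound keywords.
print(add_one.func is add, add_one.keywords)
```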

### Complex IO pipeline

In this example, we lazily read data from a [JSONL](https://jsonlines.org) file, run a pipeline of functions over it lazily, and write the results to a new [JSONL](https://jsonlines.org) file.
In practice, this lets you process huge files without loading their entire content into memory at once.

```python
from multipipe import Multipipe
from unified_io import read_jsonl, write_jsonl

# Create a pipeline of functions
pipe = Multipipe([ ... ])

# Read a JSONl file line-by-line as a generator, i.e., lazily
in_data = read_jsonl("path/to/input/file.jsonl", generator=True)

# This is still a generator.
# The pipeline will be executed lazily.
out_data = pipe(in_data, generator=True)

# Write a JSONl file from the generator executing the pipeline
write_jsonl(out_data, "path/to/output/file.jsonl")
```
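
To make the lazy reading and writing concrete, here is a minimal standard-library sketch of generator-based JSONL I/O. `iter_jsonl`, `dump_jsonl`, and `double_value` are hypothetical stand-ins written for illustration. They are not the `unified_io` functions, and the built-in `map` call merely stands in for the pipeline step.

```python
import json
import os
import tempfile

def iter_jsonl(path):
    """Yield one parsed JSON object per non-empty line (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

def dump_jsonl(records, path):
    """Write one JSON object per line, consuming the iterable lazily."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")

def double_value(record):
    return {**record, "value": record["value"] * 2}

tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "in.jsonl")
out_path = os.path.join(tmp_dir, "out.jsonl")

# Create a small input file for the demonstration.
dump_jsonl(({"value": i} for i in range(3)), in_path)

# map() is lazy: no line is read until dump_jsonl starts iterating,
# so only one record is held in memory at a time.
dump_jsonl(map(double_value, iter_jsonl(in_path)), out_path)

results = list(iter_jsonl(out_path))
print(results)
```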

## 🎁 Feature Requests
Would you like to see other features implemented? Please open a [feature request](https://github.com/AmenRa/multipipe/issues/new?assignees=&labels=enhancement&template=feature_request.md&title=%5BFeature+Request%5D+title).


## 🤘 Want to contribute?
Would you like to contribute? Please drop me an [e-mail](mailto:elias.bssn@gmail.com?subject=[GitHub]%20multipipe).


## 📄 License
[multipipe](https://github.com/AmenRa/multipipe) is open-source software licensed under the [MIT license](LICENSE).

            

        