parfor


- Name: parfor
- Version: 2024.4.0
- Home page: https://github.com/wimpomp/parfor
- Summary: A package to mimic the use of parfor as done in Matlab.
- Upload time: 2024-04-26 16:34:45
- Author: Wim Pomp
- Requires Python: <4.0,>=3.10
- License: GPLv3
- Keywords: parfor, concurrency, multiprocessing, parallel

[![pytest](https://github.com/wimpomp/parfor/actions/workflows/pytest.yml/badge.svg)](https://github.com/wimpomp/parfor/actions/workflows/pytest.yml)

# Parfor
Used to parallelizing for-loops with parfor in Matlab? This package lets you do the same in Python.
Take any serial but parallelizable for-loop and execute it in parallel using easy syntax.
There is no need to worry about the technical details of the multiprocessing module, race conditions or queues:
parfor handles all of that.

Tested on Linux, Windows and OSX with Python 3.10.

## Why is parfor better than just using multiprocessing?
- Easy to use
- Uses dill instead of pickle, so many more objects can be used when parallelizing
- Progress bars are built-in
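
To illustrate the second point: the standard `pickle` module serializes functions by reference, so it rejects lambdas, whereas dill is generally able to serialize them. The sketch below shows the pickle side of that difference; the helper name `can_serialize` is made up for this example, and the dill comparison only runs if dill is installed.

```python
import pickle


def can_serialize(obj, dumps=pickle.dumps):
    """Return True if `dumps` can serialize obj, False otherwise."""
    try:
        dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


square = lambda x: x ** 2  # a lambda has no importable name

print(can_serialize(square))     # plain pickle rejects the lambda
print(can_serialize([1, 2, 3]))  # ordinary data is fine

try:
    import dill  # optional: only needed for the comparison below
    print(can_serialize(square, dumps=dill.dumps))
except ImportError:
    pass
```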

## Installation
`pip install parfor`

## Usage
Parfor decorates a function and replaces it with a list holding the results of that function, evaluated in parallel
for each item of an iterable.
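
The mechanism can be pictured with a serial stand-in. The sketch below is not parfor's implementation: `serial_parfor` is a hypothetical name, and it simply evaluates the decorated function for every item in order. It shows why the decorated name ends up bound to a list of results rather than to a function.

```python
def serial_parfor(iterable, args=(), kwargs=None):
    """Hypothetical serial stand-in for the parfor decorator."""
    kwargs = kwargs or {}

    def decorator(fun):
        # Evaluate immediately; the list of results replaces the function.
        return [fun(item, *args, **kwargs) for item in iterable]

    return decorator


@serial_parfor(range(5), (3,))
def fun(i, a):
    return a * i ** 2


print(fun)  # [0, 3, 12, 27, 48]
```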

## Requires
tqdm, dill

## Limitations
Objects passed to the pool need to be dillable, that is, dill must be able to serialize them. Generators and
SwigPyObjects are examples of objects that cannot be serialized. They can still be used as the iterator argument
of parfor, as long as the items they yield are dillable. You may be able to make other objects dillable with
`dill.register` or by implementing `__reduce__`, `__getstate__`, etc.
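
As a sketch of the `__getstate__` route: the class below holds a lock, which cannot be serialized, and leaves it out of the serialized state. The standard `pickle` module is used here as a stand-in, on the assumption that dill honours the same protocol hooks; `Counter` is a made-up class for illustration. The last lines show that a generator is rejected outright.

```python
import pickle
import threading


class Counter:
    """Example object holding a lock, which cannot be serialized."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def __getstate__(self):
        # Copy the state and drop the lock before serialization.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the state and create a fresh lock on the receiving side.
        self.__dict__.update(state)
        self._lock = threading.Lock()


restored = pickle.loads(pickle.dumps(Counter(5)))
print(restored.value)  # 5

generator = (i for i in range(3))
try:
    pickle.dumps(generator)
    generator_ok = True
except TypeError:
    generator_ok = False  # generators cannot be pickled
print(generator_ok)
```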

## Arguments
### Required:
    fun:      function taking as arguments an item from iterable, followed by the arguments defined in args & kwargs
    iterable: iterable or iterator from which each item is given to fun as its first argument

### Optional:
    args:   tuple with other positional arguments to fun
    kwargs: dict with other keyword arguments to fun
    total:  length of the iterator, for cases where len(iterator) raises an error
    desc:   string with a description for the progress bar
    bar:    bool to enable the progress bar,
                or a callback function taking the number of passed iterations as an argument
    pbar:   bool to enable the buffer indicator bar, or a callback function taking the queue size as an argument
    rP:     ratio of workers to cpu cores, default: 1
    nP:     number of workers, default: None; overrides rP if not None
    serial: execute in series instead of in parallel if True; None (default): let pmap decide
    qsize:  maximum size of the task queue
    length: deprecated alias for total
    **bar_kwargs: keyword arguments for tqdm.tqdm

### Return
    list with results from applying the function 'fun' to each iteration of the iterable / iterator
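
How `rP` and `nP` interact can be sketched as follows. This only illustrates the rule stated above and is not parfor's actual code: the helper name `resolve_workers`, the `cores` parameter and the rounding are assumptions.

```python
import os


def resolve_workers(rP=1, nP=None, cores=None):
    """Hypothetical helper: pick a worker count from rP and nP."""
    if nP is not None:
        return nP  # an explicit worker count overrides the ratio
    if cores is None:
        cores = os.cpu_count() or 1  # os.cpu_count() may return None
    # Scale the core count by the ratio, keeping at least one worker.
    return max(1, round(rP * cores))


print(resolve_workers(cores=8))          # 8
print(resolve_workers(rP=0.5, cores=8))  # 4
print(resolve_workers(rP=2, nP=3))       # 3
```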

## Examples
### Normal serial for loop
    <<
    from time import sleep

    a = 3
    fun = []
    for i in range(10):
        sleep(1)
        fun.append(a*i**2)
    print(fun)

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]
    
### Using parfor to parallelize
    <<
    from time import sleep
    from parfor import parfor
    @parfor(range(10), (3,))
    def fun(i, a):
        sleep(1)
        return a*i**2
    print(fun)

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]

    <<
    @parfor(range(10), (3,), bar=False)
    def fun(i, a):
        sleep(1)
        return a*i**2
    print(fun)

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]

### Using parfor in a script/module/.py-file
Parfor should never be executed during the import phase of a .py-file. To prevent that from happening
use the `if __name__ == '__main__':` structure:

    <<
    from time import sleep
    from parfor import parfor
    
    if __name__ == '__main__':
        @parfor(range(10), (3,))
        def fun(i, a):
            sleep(1)
            return a*i**2
        print(fun)

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]    
or:

    <<
    from time import sleep
    from parfor import parfor
    
    def my_fun(*args, **kwargs):
        @parfor(range(10), (3,))
        def fun(i, a):
            sleep(1)
            return a*i**2
        return fun
    
    if __name__ == '__main__':
        print(my_fun())

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]

### If you hate decorators not returning a function
`pmap` maps a function over an iterable like `map` does, but in parallel.

    <<
    from parfor import pmap
    from time import sleep
    def fun(i, a):
        sleep(1)
        return a*i**2
    print(pmap(fun, range(10), (3,)))

    >> [0, 3, 12, 27, 48, 75, 108, 147, 192, 243]     
    
### Using generators
If iterables like lists and tuples are too big to fit in memory, use generators instead.
Since generators have no predefined length, you can optionally pass the length to parfor with the `total` argument.
    
    <<
    import numpy as np
    c = (im for im in imagereader)
    @parfor(c, total=len(imagereader))
    def fun(im):
        return np.mean(im)
        
    >> [list with means of the images]
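
Why `total` helps can be seen without parfor: calling `len()` on a generator raises a `TypeError`, so the number of iterations is unknown unless it is supplied. A minimal sketch, where `resolve_total` is a hypothetical helper and not part of parfor:

```python
def resolve_total(iterable, total=None):
    """Hypothetical helper: find the number of iterations, if known."""
    if total is not None:
        return total  # an explicit total always wins
    try:
        return len(iterable)
    except TypeError:
        return None  # generators and other iterators have no len()


squares = (i ** 2 for i in range(10))

print(resolve_total(range(10)))          # 10
print(resolve_total(squares))            # None
print(resolve_total(squares, total=10))  # 10
```

Without a known total, a progress bar can only count iterations and cannot show a percentage or a remaining-time estimate.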
    
# Extras
## `pmap`
The function that parfor wraps; use it like `map`.

## `Chunks`
Split a long iterator into bite-sized chunks to parallelize.
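
The idea of chunking can be sketched with the standard library alone. The generator below is a hypothetical stand-in and does not reproduce the `Chunks` class or its signature:

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


print(list(chunked(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```

Each chunk can then be handed to a worker as a single task, which reduces the per-item overhead of sending work to another process.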

## `ParPool`
Lower-level access to parallel execution. Submit tasks and request their results at any time
(a task must be submitted before its result can be requested), and use different functions and function
arguments for different tasks.
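
The submit-first, request-later pattern can be illustrated with `concurrent.futures` from the standard library. This is a stand-in for the idea only: it uses threads rather than processes and does not show the `ParPool` API, whose method names are not documented here.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x ** 2


def join_words(words, sep=" "):
    return sep.join(words)


with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit tasks with different functions and arguments, keyed by a handle.
    tasks = {
        "a": pool.submit(square, 7),
        "b": pool.submit(join_words, ["par", "for"], sep=""),
    }
    # Request the results later, in any order, by handle.
    result_b = tasks["b"].result()
    result_a = tasks["a"].result()

print(result_a, result_b)  # 49 parfor
```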


            
