Cluster Futures
===============

This module provides a Python `concurrent.futures`_ executor that lets you run
functions on remote systems in your `HTCondor`_ or `Slurm`_ cluster. Stop worrying
about writing job files, scattering/gathering, and serialization: this module
does it all for you.

It uses the `cloudpickle`_ library to allow (most) closures to be used
transparently, so you're not limited to "pure" functions.
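
To see why closure support matters, the sketch below shows the standard
library's ``pickle`` module refusing to serialize a locally defined closure.
``make_scaler`` and ``stdlib_can_pickle`` are hypothetical helpers written for
this illustration; they are not part of clusterfutures.

.. code-block:: python

    import pickle


    def make_scaler(factor):
        """Return a closure that captures ``factor`` (hypothetical helper)."""
        def scale(n):
            return n * factor
        return scale


    def stdlib_can_pickle(obj):
        """Report whether the standard pickle module can serialize ``obj``."""
        try:
            pickle.dumps(obj)
        except (pickle.PicklingError, AttributeError, TypeError):
            return False
        return True


    triple = make_scaler(3)
    print(stdlib_can_pickle(triple))  # a locally defined closure
    print(stdlib_can_pickle(len))     # a built-in, pickled by reference

With cloudpickle installed, ``cloudpickle.dumps(triple)`` serializes the closure
by value, and the resulting bytes can be loaded with ``pickle.loads``.
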

Installation::

    pip install clusterfutures

Usage:

.. code-block:: python

    import cfut


    def square(n):
        return n * n


    with cfut.SlurmExecutor() as executor:
        for result in executor.map(square, [5, 7, 11]):
            print(result)


See `slurm_example.py`_ and `condor_example.py`_ for further examples.
The easiest way to get started is to ignore the fact that futures are being
used at all and just use the provided ``map`` function, which behaves like the
built-in ``map`` (`itertools.imap`_ in Python 2) but transparently distributes
your work across the cluster.
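
When you do want the futures themselves, the standard `concurrent.futures`
interface offers ``submit`` and ``as_completed``. The sketch below uses
``ThreadPoolExecutor`` as a local stand-in so that it runs without a cluster.
That substitution is an assumption made for illustration; because this module
provides a `concurrent.futures` executor, the same pattern should carry over
to ``cfut.SlurmExecutor()``.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor, as_completed


    def square(n):
        return n * n


    # ThreadPoolExecutor is a local stand-in (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=2) as executor:
        # submit() returns a future immediately; the work runs in the background.
        futures = {executor.submit(square, n): n for n in [5, 7, 11]}

        results = {}
        # as_completed() yields each future as soon as its result is ready.
        for future in as_completed(futures):
            results[futures[future]] = future.result()

    print(sorted(results.items()))

Results arrive in completion order rather than submission order, which is why
the sketch keys them by input value.
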

Goals & design
--------------

*clusterfutures* is a simple wrapper to run Python functions in batch jobs on
an HPC cluster. Each future corresponds to one batch job. The functions
that you run through clusterfutures should normally run for at least a few
seconds each: running smaller functions will be inefficient because of the
overhead of launching jobs and moving data.
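
One way to keep each job long enough is to group many small calls into a single
job. The sketch below is an illustration only: ``chunked`` and ``square_chunk``
are hypothetical helpers, not part of clusterfutures, and ``ThreadPoolExecutor``
stands in for a cluster executor so that the example runs locally.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def square(n):
        return n * n


    def square_chunk(chunk):
        """Process a whole chunk of inputs inside a single job."""
        return [square(n) for n in chunk]


    def chunked(items, size):
        """Split ``items`` into consecutive lists of at most ``size`` elements."""
        return [items[i:i + size] for i in range(0, len(items), size)]


    numbers = list(range(10))
    with ThreadPoolExecutor(max_workers=2) as executor:
        # One job per chunk of 4 inputs rather than one job per number.
        chunk_results = executor.map(square_chunk, chunked(numbers, 4))
        squares = [value for chunk in chunk_results for value in chunk]

    print(squares)

The chunk size is a tuning choice: larger chunks amortize the launch overhead
over more work, at the cost of less parallelism.
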

Functions, parameters and return values are sent by creating files; this assumes
that the control process and the worker nodes have a shared filesystem.
This mechanism is convenient for relatively small amounts of data; it's probably
not the best way to transfer large amounts of data to & from workers.
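
The file-based exchange can be pictured with a minimal sketch. It is not the
actual clusterfutures implementation: the file names and the ``write_job``,
``run_job`` and ``read_result`` helpers are hypothetical, a temporary directory
stands in for the shared filesystem, and the standard ``pickle`` module is used
in place of cloudpickle (so the function sent must be importable, here
``math.hypot``).

.. code-block:: python

    import math
    import pickle
    import tempfile
    from pathlib import Path


    def write_job(path, func, *args, **kwargs):
        """Control side: serialize the call into a file."""
        path.write_bytes(pickle.dumps((func, args, kwargs)))


    def run_job(in_path, out_path):
        """Worker side: load the call, run it, and write the outcome to a file."""
        func, args, kwargs = pickle.loads(in_path.read_bytes())
        try:
            outcome = (True, func(*args, **kwargs))
        except Exception as exc:  # report failures back instead of losing them
            outcome = (False, repr(exc))
        out_path.write_bytes(pickle.dumps(outcome))


    def read_result(out_path):
        """Control side: read back the (success, value) pair."""
        return pickle.loads(out_path.read_bytes())


    # A temporary directory stands in for the shared filesystem.
    with tempfile.TemporaryDirectory() as shared:
        in_file = Path(shared) / "job.in.pkl"    # hypothetical file names
        out_file = Path(shared) / "job.out.pkl"
        write_job(in_file, math.hypot, 3, 4)
        run_job(in_file, out_file)  # on a cluster, this step runs in a batch job
        ok, value = read_result(out_file)

    print(ok, value)

Every argument and return value makes a round trip through the filesystem,
which is why this approach suits small payloads better than large ones.
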

.. _concurrent.futures: https://docs.python.org/3/library/concurrent.futures.html
.. _HTCondor: https://research.cs.wisc.edu/htcondor/
.. _cloudpickle: https://github.com/cloudpipe/cloudpickle
.. _itertools.imap: https://docs.python.org/2/library/itertools.html#itertools.imap
.. _Slurm: https://slurm.schedmd.com/
.. _slurm_example.py: https://github.com/sampsyo/clusterfutures/blob/master/slurm_example.py
.. _condor_example.py: https://github.com/sampsyo/clusterfutures/blob/master/condor_example.py
Raw data
{
"_id": null,
"home_page": "https://github.com/sampsyo/clusterfutures",
"name": "clusterfutures",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "",
"author": "Adrian Sampson",
"author_email": "asampson@cs.washington.edu",
"download_url": "https://files.pythonhosted.org/packages/91/05/667b80f05dfb2b175f8d36f3c737e5ae9cb6e65f0c0a2cea0d655d73c2c8/clusterfutures-0.5.tar.gz",
"platform": "ALL",
"description": "Cluster Futures\n===============\n\nThis module provides a Python `concurrent.futures`_ executor that lets you run\nfunctions on remote systems in your `HTCondor`_ or `Slurm`_ cluster. Stop worrying\nabout writing job files, scattering/gathering, and serialization---this module\ndoes it all for you.\n\nIt uses the `cloudpickle`_ library to allow (most) closures to be used\ntransparently, so you're not limited to \"pure\" functions.\n\nInstallation::\n\n pip install clusterfutures\n\nUsage:\n\n.. code-block:: python\n\n import cfut\n def square(n):\n return n * n\n\n with cfut.SlurmExecutor() as executor:\n for result in executor.map(square, [5, 7, 11]):\n print(result)\n\nSee `slurm_example.py`_ and `condor_example.py`_ for further examples.\nThe easiest way to get started is to\nignore the fact that futures are being used at all and just use the provided\n``map`` function, which behaves like `itertools.imap`_ but transparently\ndistributes your work across the cluster.\n\nGoals & design\n--------------\n\n*clusterfutures* is a simple wrapper to run Python functions in batch jobs on\nan HPC cluster. Each future corresponds to one batch job. The functions\nthat you run through clusterfutures should normally run for at least a few\nseconds each: running smaller functions will be inefficient because of the\noverhead of launching jobs and moving data.\n\nFunctions, parameters and return values are sent by creating files; this assumes\nthat the control process and the worker nodes have a shared filesystem.\nThis mechanism is convenient for relatively small amounts of data; it's probably\nnot the best way to transfer large amounts of data to & from workers.\n\n.. _concurrent.futures:\n https://docs.python.org/3/library/concurrent.futures.html\n.. _HTCondor: https://research.cs.wisc.edu/htcondor/\n.. _cloudpickle: https://github.com/cloudpipe/cloudpickle\n.. _itertools.imap: https://docs.python.org/3/library/itertools.html#itertools.imap\n.. 
_Slurm: https://slurm.schedmd.com/\n.. _slurm_example.py: https://github.com/sampsyo/clusterfutures/blob/master/slurm_example.py\n.. _condor_example.py: https://github.com/sampsyo/clusterfutures/blob/master/condor_example.py\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "futures for remote execution on clusters",
"version": "0.5",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "8a0cb3357f3f7fe40009f24c2aac4ad6ae902071c5eff340c162f4c895751a20",
"md5": "11ca7a908f1a5f6cd5277c8830ba80aa",
"sha256": "391c9258da445366b7e859ac8ed2883aecfd26de364959fc1910e1a7c63bb933"
},
"downloads": -1,
"filename": "clusterfutures-0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "11ca7a908f1a5f6cd5277c8830ba80aa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 9569,
"upload_time": "2023-01-04T12:00:04",
"upload_time_iso_8601": "2023-01-04T12:00:04.375847Z",
"url": "https://files.pythonhosted.org/packages/8a/0c/b3357f3f7fe40009f24c2aac4ad6ae902071c5eff340c162f4c895751a20/clusterfutures-0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "9105667b80f05dfb2b175f8d36f3c737e5ae9cb6e65f0c0a2cea0d655d73c2c8",
"md5": "a90c41b7a8ad8d5be30638330008dba0",
"sha256": "261e82c44b500e39b71e3a2c41db66d9777e9674d58449cb90aa83a1955e9453"
},
"downloads": -1,
"filename": "clusterfutures-0.5.tar.gz",
"has_sig": false,
"md5_digest": "a90c41b7a8ad8d5be30638330008dba0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 9113,
"upload_time": "2023-01-04T12:00:06",
"upload_time_iso_8601": "2023-01-04T12:00:06.569674Z",
"url": "https://files.pythonhosted.org/packages/91/05/667b80f05dfb2b175f8d36f3c737e5ae9cb6e65f0c0a2cea0d655d73c2c8/clusterfutures-0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-01-04 12:00:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "sampsyo",
"github_project": "clusterfutures",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "clusterfutures"
}