submitit


- Name: submitit
- Version: 1.5.2
- Summary: Python 3.8+ toolbox for submitting jobs to Slurm
- Upload time: 2024-09-18 16:05:11
- Author: Facebook AI Research
- Requires Python: >=3.8

[![CircleCI](https://circleci.com/gh/facebookincubator/submitit.svg?style=svg)](https://circleci.com/gh/facebookincubator/workflows/submitit)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Pypi](https://img.shields.io/pypi/v/submitit)](https://pypi.org/project/submitit/)
[![conda-forge](https://img.shields.io/conda/vn/conda-forge/submitit)](https://anaconda.org/conda-forge/submitit)
# Submit it!

## What is submitit?

Submitit is a lightweight tool for submitting Python functions for computation within a Slurm cluster.
It wraps job submission and provides access to results, logs and more.
[Slurm](https://slurm.schedmd.com/quickstart.html) is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters.
Submitit lets you switch seamlessly between executing on Slurm and executing locally.

### An example is worth a thousand words: performing an addition

From inside an environment with `submitit` installed:

```python
import submitit

def add(a, b):
    return a + b

# executor is the submission interface (logs are dumped in the folder)
executor = submitit.AutoExecutor(folder="log_test")
# set timeout in min, and partition for running the job
executor.update_parameters(timeout_min=1, slurm_partition="dev")
job = executor.submit(add, 5, 7)  # will compute add(5, 7)
print(job.job_id)  # ID of your job

output = job.result()  # waits for completion and returns output
assert output == 12  # 5 + 7 = 12...  your addition was computed in the cluster
```

The `Job` class also provides tools for reading the log files (`job.stdout()` and `job.stderr()`).

If what you want to run is a command, turn it into a Python function using `submitit.helpers.CommandFunction`, then submit it.
By default stdout is silenced in `CommandFunction`, but it can be unsilenced with `verbose=True`.

**Find more examples [here](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/examples.md)!!!**

Submitit is a Python 3.8+ toolbox for submitting jobs to Slurm.
It aims at running Python functions from Python code.


## Install

Quick install, in a virtualenv/conda environment where `pip` is installed (check `which pip`):
- stable release:
  ```
  pip install submitit
  ```
- stable release using __conda__:
  ```
  conda install -c conda-forge submitit
  ```
- main branch:
  ```
  pip install git+https://github.com/facebookincubator/submitit@main#egg=submitit
  ```

You can try running the [MNIST example](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/mnist.py) to check that everything is working as expected (requires sklearn).


## Documentation

See the following pages for more detailed information:

- [Examples](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/examples.md): for a bunch of examples dealing with errors, concurrency, multi-tasking, etc.
- [Structure and main objects](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/structure.md): to get a better understanding of how `submitit` works, which files are created for each job, and the main objects you will interact with.
- [Checkpointing](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/checkpointing.md): to understand how you can configure your job to get checkpointed when preempted and/or timed-out.
- [Tips and caveats](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/tips.md): for a bunch of information that can be handy when working with `submitit`.
- [Hyperparameter search with nevergrad](https://github.com/facebookincubator/submitit/blob/1.5.2/docs/nevergrad.md): basic example of `nevergrad` usage and how it interfaces with `submitit`.


### Goals

The aim of this Python3 package is to launch jobs on Slurm painlessly from *inside Python*, using the same submission and job patterns as the standard library package `concurrent.futures`.
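
To make the shared pattern concrete, here is a minimal sketch using only the standard library. The helper `run_addition` is a hypothetical name introduced for illustration; it relies only on the `submit(fn, *args)` and `result()` calls that both kinds of executor expose.

```python
from concurrent.futures import ThreadPoolExecutor


def add(a, b):
    return a + b


def run_addition(executor):
    """Submit `add` through any executor exposing `submit(fn, *args)`."""
    job = executor.submit(add, 5, 7)
    return job.result()  # waits for completion and returns the output


# Local run with a standard library executor.
with ThreadPoolExecutor(max_workers=1) as executor:
    local_output = run_addition(executor)
print(local_output)
```

On a cluster, the same helper could instead receive a `submitit.AutoExecutor(folder="log_test")` configured as in the addition example.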

Here are a few benefits of using this lightweight package:
 - submit any function, even lambda and script-defined functions.
 - raises an error with stack trace if the job failed.
 - requeue preempted jobs (Slurm only)
 - swap between a `submitit` executor and one of the `concurrent.futures` executors in one line, so that it is easy to run your code either on Slurm or locally, with multithreading for instance.
 - checkpoints stateful callables when preempted or timed-out and requeue from current state (advanced feature).
 - easy access to task local/global rank for multi-nodes/tasks jobs.
 - same code can work for different clusters thanks to a plugin system.

Submitit is used by FAIR researchers on the FAIR cluster.
The defaults are chosen to make their lives easier, and might not be ideal for every cluster.

### Non-goals

- a command-line tool for running Slurm jobs. Here, everything happens inside Python. For a command-line workflow, you can however use [Hydra](https://hydra.cc/)'s [submitit plugin](https://hydra.cc/docs/next/plugins/submitit_launcher) (version >= 1.0.0).
- a task queue: this only implements the ability to launch tasks, and does not schedule them in any way.
- being used in Python2! This is a Python3.8+ only package :)


### Comparison with dask.distributed

[`dask`](https://distributed.dask.org/en/latest/) is a nice framework for distributed computing. `dask.distributed` provides the same `concurrent.futures` executor API as `submitit`:

```python
from distributed import Client
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(processes=1, cores=2, memory="2GB")
cluster.scale(2)  # this may take a few seconds to launch
executor = Client(cluster)
executor.submit(...)
```

The key difference is that `dask.distributed` distributes the jobs to a pool of workers (see the `cluster` variable above), while each `submitit` job is submitted directly as a job on the cluster. In that sense, `submitit` is a lower-level interface than `dask.distributed`: you get more direct control over your jobs, including individual `stdout` and `stderr`, and possibly checkpointing in case of preemption and timeout. On the other hand, you should avoid submitting many small tasks with `submitit`, since each one becomes an independent job and together they may overload the cluster, whereas `dask.distributed` handles this without any problem.
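
One way to follow the advice about small tasks is to group them into chunks, so that each job processes a whole chunk. The sketch below is an illustration under stated assumptions: `square`, `run_chunk` and `chunked` are hypothetical helpers, and a standard library `ThreadPoolExecutor` stands in for the executor so the example runs anywhere.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def run_chunk(values):
    """Process a whole chunk of small tasks inside a single job."""
    return [square(v) for v in values]


def chunked(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


inputs = list(range(10))
chunks = chunked(inputs, 4)  # 3 jobs instead of 10

# Stand-in executor; any executor with `submit(fn, *args)` would fit here.
with ThreadPoolExecutor(max_workers=2) as executor:
    jobs = [executor.submit(run_chunk, chunk) for chunk in chunks]
    results = [y for job in jobs for y in job.result()]
print(results)
```

The chunk size trades scheduling overhead against parallelism: larger chunks mean fewer jobs, but each one runs longer.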


## Contributors

In chronological order: Jérémy Rapin, Louis Martin, Lowik Chanussot, Lucas Hosseini, Fabio Petroni, Francisco Massa, Guillaume Wenzek, Thibaut Lavril, Vinayak Tantia, Andrea Vedaldi, Max Nickel, Quentin Duval (feel free to [contribute](https://github.com/facebookincubator/submitit/blob/1.5.2/.github/CONTRIBUTING.md) and add your name ;) )

## License

Submitit is released under the [MIT License](https://github.com/facebookincubator/submitit/blob/1.5.2/LICENSE).

            

        