snakemake-executor-plugin-lsf


Name: snakemake-executor-plugin-lsf
Version: 0.2.6
Home page: https://github.com/befh/snakemake-executor-plugin-lsf
Summary: A Snakemake executor plugin for submitting jobs to a LSF cluster.
Upload time: 2024-06-03 14:47:21
Author: Brian Fulton-Howard
Requires Python: <4.0,>=3.11
License: MIT
Keywords: snakemake, plugin, executor, cluster, lsf

# Snakemake executor plugin: LSF

[LSF](https://www.ibm.com/docs/en/spectrum-lsf/) is a common high-performance
computing batch system.

## Specifying Project and Queue

LSF clusters can have mandatory resource indicators for
accounting and scheduling, `Project` and `Queue`, respectively.
These resources are usually
omitted from Snakemake workflows in order to keep the workflow
definition independent of the platform. However, it is also possible
to specify them inside the workflow as resources in the rule
definition (see the Snakemake documentation on
[resources](https://snakemake.readthedocs.io/en/latest/snakefiles/rules.html#resources)).

To specify them at the command line, define them as default resources:

``` console
$ snakemake --executor lsf --default-resources lsf_project=<your LSF project> lsf_queue=<your LSF queue>
```

If individual rules require e.g. a different queue, you can override
the default per rule:

``` console
$ snakemake --executor lsf --default-resources lsf_project=<your LSF project> lsf_queue=<your LSF queue> --set-resources <somerule>:lsf_queue=<some other queue>
```
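
When the workflow is launched from a script, the same command line can be assembled programmatically. The following is only a sketch: the helper name `build_snakemake_cmd` and the project, queue, and rule names are hypothetical, and it uses nothing beyond the flags shown above.

``` python
import shlex


def build_snakemake_cmd(project, queue, queue_overrides=None):
    """Assemble the snakemake call shown above as an argument list.

    queue_overrides maps rule names to the queue that rule should use.
    """
    cmd = [
        "snakemake",
        "--executor", "lsf",
        "--default-resources",
        f"lsf_project={project}",
        f"lsf_queue={queue}",
    ]
    if queue_overrides:
        cmd.append("--set-resources")
        for rule, rule_queue in queue_overrides.items():
            cmd.append(f"{rule}:lsf_queue={rule_queue}")
    return cmd


# Hypothetical project, queue, and rule names.
cmd = build_snakemake_cmd("acc_demo", "short", {"align": "long"})
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues.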

Usually, it is advisable to persist such settings via a
[configuration profile](https://snakemake.readthedocs.io/en/latest/executing/cli.html#profiles), which
can be provided system-wide, per user, and in addition per workflow.

This is an example of the relevant profile settings:

```yaml
jobs: '<max concurrent jobs>'
executor: lsf
default-resources:
  - 'lsf_project=<your LSF project>'
  - 'lsf_queue=<your LSF queue>'
```
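
If profiles are generated per project or per user, the settings above can be written out by a small script. This is a minimal sketch, assuming a profile is a directory containing a `config.yaml`; the function names, the directory `profiles/lsf`, and the example values are hypothetical.

``` python
from pathlib import Path


def render_profile(project, queue, jobs):
    """Return profile settings in the YAML layout shown above."""
    lines = [
        f"jobs: {jobs}",
        "executor: lsf",
        "default-resources:",
        f"  - 'lsf_project={project}'",
        f"  - 'lsf_queue={queue}'",
    ]
    return "\n".join(lines) + "\n"


def write_profile(profile_dir, project, queue, jobs=100):
    """Write config.yaml into profile_dir and return its path."""
    profile_dir = Path(profile_dir)
    profile_dir.mkdir(parents=True, exist_ok=True)
    config_path = profile_dir / "config.yaml"
    config_path.write_text(render_profile(project, queue, jobs))
    return config_path


# Hypothetical values; prints the YAML without touching the file system.
print(render_profile("acc_demo", "short", 100))
```

After calling `write_profile("profiles/lsf", "acc_demo", "short")`, the profile can be selected with `snakemake --profile profiles/lsf`.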

## Ordinary SMP jobs

Most jobs will be carried out by programs which are either single-core
scripts or threaded programs, hence SMP ([shared memory
programs](https://en.wikipedia.org/wiki/Shared_memory)) in nature. Any
given `threads` and `mem_mb` requirements will be passed to LSF:

``` python
rule a:
    input: ...
    output: ...
    threads: 8
    resources:
        mem_mb=14000
```

This will give jobs from this rule 14 GB of memory and 8 CPU cores. It is
advisable to use reasonable default resources, such that you don't need
to specify them for every rule. Snakemake already has reasonable
defaults built in, which are automatically activated when using any non-local executor
(hence also with LSF). Use `mem_mb_per_cpu` to specify memory per CPU, which is the standard form of an LSF memory request.

## MPI jobs

Snakemake's LSF backend also supports MPI jobs; see the MPI section of
the Snakemake documentation for details.

``` python
rule calc_pi:
    output:
        "pi.calc",
    log:
        "logs/calc_pi.log",
    threads: 40
    resources:
        tasks=10,
        mpi="mpirun",
    shell:
        "{resources.mpi} -np {resources.tasks} calc-pi-mpi > {output} 2> {log}"
```

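To use a different MPI launcher without editing the workflow, override the
`mpi` resource at the command line:

``` console
$ snakemake --set-resources calc_pi:mpi="mpiexec" ...
```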
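
The shell command in `calc_pi` is a template that is filled in from the rule's resources, so changing the `mpi` resource changes the launcher. The sketch below imitates that substitution with plain `str.format` as a stand-in for Snakemake's own formatting; the `expand` helper is hypothetical.

``` python
from types import SimpleNamespace

# The shell template from the calc_pi rule above.
SHELL_TEMPLATE = (
    "{resources.mpi} -np {resources.tasks} calc-pi-mpi > {output} 2> {log}"
)


def expand(mpi="mpirun", tasks=10):
    """Fill the template using str.format (a stand-in for Snakemake)."""
    resources = SimpleNamespace(mpi=mpi, tasks=tasks)
    return SHELL_TEMPLATE.format(
        resources=resources, output="pi.calc", log="logs/calc_pi.log"
    )


print(expand())               # launcher taken from the rule definition
print(expand(mpi="mpiexec"))  # launcher overridden via --set-resources
```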

## Advanced Resource Specifications

A workflow rule may support a number of
[resource specifications](https://snakemake.readthedocs.io/en/latest/snakefiles/rules.html#resources).
For an LSF cluster, these Snakemake resources need to be mapped onto LSF options.

You can use the following specifications:

| LSF                                | Snakemake        | Description                                                                              |
|------------------------------------|------------------|------------------------------------------------------------------------------------------|
| `-q`                               | `lsf_queue`      | the queue a rule/job is to use                                                           |
| `-W`                               | `walltime`       | the walltime per job in minutes                                                          |
| `-R "rusage[mem=<memory_amount>]"` | `mem`, `mem_mb`  | memory a cluster node must provide (`mem`: string with unit, `mem_mb`: int)              |
| `-R "rusage[mem=<memory_amount>]"` | `mem_mb_per_cpu` | memory per reserved CPU                                                                  |
| omit `-R span[hosts=1]`            | `mpi`            | allow splitting across nodes for MPI                                                     |
| `-R span[ptile=<ptile>]`           | `ptile`          | processors per host; requires `mpi`                                                      |
| Other `bsub` arguments             | `lsf_extra`      | other arguments to pass to `bsub` (str)                                                  |


Each of these can be part of a rule, e.g.:

``` python
rule:
    input: ...
    output: ...
    resources:
        lsf_queue="<queue name>",
        walltime=<some number>
```

`walltime` and `runtime` are synonyms.

Please note: `mem`/`mem_mb` and `mem_mb_per_cpu` both map onto the same
`-R "rusage[mem=...]"` request, so they are mutually exclusive. You can
reserve either the memory a compute node has to provide or the memory
required per CPU, but not both
(LSF does not make any distinction between real CPU cores and those
provided by hyperthreads). The executor converts the provided value into
the form your cluster expects, based on the cluster configuration.
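
To make the mapping in the table concrete, here is a rough sketch of how a resource dictionary could be turned into `bsub` arguments. It is not the plugin's actual implementation: the function name `bsub_resource_args` is hypothetical, and memory is left out apart from the mutual-exclusion check.

``` python
import shlex


def bsub_resource_args(resources):
    """Translate a dict of Snakemake resources into bsub arguments.

    Illustrative only; follows the rows of the table above.
    """
    has_node_mem = "mem" in resources or "mem_mb" in resources
    if has_node_mem and "mem_mb_per_cpu" in resources:
        raise ValueError("mem/mem_mb and mem_mb_per_cpu are mutually exclusive")

    args = []
    if "lsf_queue" in resources:
        args += ["-q", str(resources["lsf_queue"])]
    # walltime and runtime are synonyms; both are given in minutes.
    minutes = resources.get("walltime", resources.get("runtime"))
    if minutes is not None:
        args += ["-W", str(minutes)]
    if resources.get("mpi"):
        # MPI jobs may span hosts; ptile sets processors per host.
        if "ptile" in resources:
            args += ["-R", f"span[ptile={resources['ptile']}]"]
    else:
        # Non-MPI jobs are kept on a single host.
        args += ["-R", "span[hosts=1]"]
    if "lsf_extra" in resources:
        args += shlex.split(resources["lsf_extra"])
    return args


# Hypothetical queue name and walltime.
print(bsub_resource_args({"lsf_queue": "short", "walltime": 60}))
```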

## Additional custom job configuration

There are various `bsub` options not directly supported via the resource
definitions shown above. You may use the `lsf_extra` resource to specify
additional flags to `bsub`:

``` python
rule myrule:
    input: ...
    output: ...
    resources:
        lsf_extra="-R a100 -gpu num=2"
```

As with project and queue, it is preferable to specify such resources in a [profile](https://snakemake.readthedocs.io/en/latest/executing/cli.html#profiles).

## Clusters that use per-job memory requests instead of per-core

By default, this plugin converts the specified memory request into the per-core request expected by most LSF clusters.
So `threads: 4` and `mem_mb=128` will result in `-R rusage[mem=32]`. If the request should be per-job on your cluster
(i.e. `-R rusage[mem=<mem_mb>]`) then set the environment variable `SNAKEMAKE_LSF_MEMFMT` to `perjob`.
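
The arithmetic behind this conversion can be sketched as follows. This is an illustration rather than the plugin's code: the function name `rusage_mem` is hypothetical, and rounding up to a whole number is an assumption.

``` python
import math
import os


def rusage_mem(mem_mb, threads, per_job=False):
    """Return the value to place in -R rusage[mem=...].

    per_job=True keeps the total request; otherwise the request is
    divided across the threads (rounding up is an assumption).
    """
    if per_job:
        return mem_mb
    return math.ceil(mem_mb / max(threads, 1))


# Follow the environment variable described above.
per_job = os.environ.get("SNAKEMAKE_LSF_MEMFMT") == "perjob"
print(rusage_mem(128, 4, per_job=per_job))
print(rusage_mem(14000, 8))  # rule "a" from the SMP section, per core
```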

The executor automatically detects the request unit from cluster configuration, so if your cluster does not use MB,
you do not need to do anything.


            
