snakemake-executor-plugin-pcluster-slurm

Name: snakemake-executor-plugin-pcluster-slurm
Version: 0.0.31
Home page: https://github.com/Daylily-Informatics/snakemake-executor-plugin-pcluster-slurm
Summary: A Snakemake executor plugin for submitting jobs to an AWS Parallel Cluster (pcluster) SLURM cluster.
Upload time: 2024-10-22 13:03:07
Author: John Major
Requires Python: <4.0,>=3.11
License: MIT
Keywords: snakemake, plugin, executor, cluster, slurm, pcluster, aws, parallel-compute, parallel-cluster
# Snakemake executor plugin: pcluster-slurm _v0.0.7_

# Snakemake Executor Plugins (generally)
General documentation for executor plugins lives in the [Snakemake plugin catalog](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor).

## `pcluster-slurm` plugin
### AWS Parallel Cluster, `pcluster` `slurm`
[AWS Parallel Cluster](https://aws.amazon.com/hpc/parallelcluster/) is a framework for deploying and managing dynamically scalable HPC clusters on AWS, with SLURM as the batch system; `pcluster` handles all of the creating, configuring, and deleting of the cluster's compute nodes, which may be spot or dedicated instances. **Note:** the `AWS Parallel Cluster` build of SLURM has a few small but critical differences from the standard SLURM distribution. This plugin enables using SLURM from pcluster head and compute nodes with Snakemake `>=8`.

#### [Daylily Bfx Framework](https://github.com/Daylily-Informatics/daylily)
[Daylily](https://github.com/Daylily-Informatics/daylily) is a bioinformatics framework that automates and standardizes all aspects of creating a self-scaling, ephemeral cluster that can grow from one head node to many thousands of as-needed spot compute instances (modulo your quotas and budget). It uses [AWS Parallel Cluster](https://aws.amazon.com/hpc/parallelcluster/) to manage the cluster and Snakemake to manage the bfx workflows; in this context, SLURM is the intermediary between Snakemake and cluster resource management. The `pcluster` SLURM variant does not play nicely with vanilla SLURM, and to date the standard SLURM Snakemake executor has not worked with `pcluster` SLURM. This plugin bridges Snakemake and `pcluster-slurm`.



# Pre-requisites
## Snakemake >= 8
### Conda
```bash
conda create -n snakemake -c conda-forge -c bioconda snakemake==8.20.6
conda activate snakemake
```

# Installation (pip)
_From an environment with Snakemake and pip installed:_
```bash
pip install snakemake-executor-plugin-pcluster-slurm
```
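
To sanity-check the install, you can confirm the package is visible from the same environment (plain `pip` and Snakemake commands, nothing plugin-specific):

```bash
# Verify the plugin package is installed in the active environment.
pip show snakemake-executor-plugin-pcluster-slurm

# Snakemake >=8 discovers installed executor plugins automatically;
# a clean --version run confirms the environment itself is intact.
snakemake --version
```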

# Example Usage (on a [daylily cluster headnode](https://github.com/Daylily-Informatics/daylily))
```bash
mkdir -p /fsx/resources/environments/containers/ubuntu/cache/
export SNAKEMAKE_OUTPUT_CACHE=/fsx/resources/environments/containers/ubuntu/cache/
snakemake --use-conda --use-singularity -j 10 \
  --singularity-prefix /fsx/resources/environments/containers/ubuntu/ip-10-0-0-240/ \
  --singularity-args " -B /tmp:/tmp -B /fsx:/fsx -B /home/$USER:/home/$USER -B $PWD/:$PWD" \
  --conda-prefix /fsx/resources/environments/containers/ubuntu/ip-10-0-0-240/ \
  --executor pcluster-slurm \
  --default-resources slurm_partition='i64,i128,i192' \
  --cache --verbose -k
```
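
If this plugin follows the upstream SLURM executor's resource naming (an assumption, not something this README states), individual rules can be steered to specific partitions with Snakemake's standard `--set-resources` flag. `align` below is a hypothetical rule name:

```bash
# Route the hypothetical "align" rule to the i192 partition; all other
# jobs fall back to the --default-resources partition list.
snakemake --executor pcluster-slurm \
  --default-resources slurm_partition='i64,i128,i192' \
  --set-resources align:slurm_partition='i192' \
  -j 10 -k
```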



## What Partitions Are Available?
Use `sinfo` to learn about your cluster. Note that `sinfo` reports on all potential as well as active compute nodes; read the docs to interpret which are active, which are not-yet-requested spot instances, and so on. Below is what the [daylily AWS parallel cluster](https://github.com/Daylily-Informatics/daylily/blob/main/config/day_cluster/prod_cluster.yaml) looks like.

```bash
sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
i8*          up   infinite     12  idle~ i8-dy-gb64-[1-12]
i64          up   infinite     16  idle~ i64-dy-gb256-[1-8],i64-dy-gb512-[1-8]
i96          up   infinite     16  idle~ i96-dy-gb384-[1-8],i96-dy-gb768-[1-8]
i128         up   infinite     28  idle~ i128-dy-gb256-[1-8],i128-dy-gb512-[1-10],i128-dy-gb1024-[1-10]
i192         up   infinite     30  idle~ i192-dy-gb384-[1-10],i192-dy-gb768-[1-10],i192-dy-gb1536-[1-10]
a192         up   infinite     30  idle~ a192-dy-gb384-[1-10],a192-dy-gb768-[1-10],a192-dy-gb1536-[1-10]
```
- If `slurm_partition` is left unset, jobs will likely land on the cluster's default partition, which `sinfo` marks with an asterisk (`i8*` in the output above).
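
A quick way to check which partition is the default (SLURM suffixes it with `*` in `sinfo` output):

```bash
# List partition names; the default partition carries a trailing "*".
sinfo --noheader -o "%P"

# Show only the default partition (i8* in the example above).
sinfo --noheader -o "%P" | grep '\*'
```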

  

# Other Cool Stuff
## Real-Time Cost Tracking & Usage Throttling via Budgets, Tagging, and the `--comment` sbatch Flag
I make extensive use of [Cost allocation tags with AWS ParallelCluster](https://github.com/Daylily-Informatics/aws-parallelcluster-cost-allocation-tags) in the [daylily omics analysis framework](https://github.com/Daylily-Informatics/daylily?tab=readme-ov-file#daylily-aws-ephemeral-cluster-setup-0714) ([_$3 30x WGS analysis_](https://github.com/Daylily-Informatics/daylily?tab=readme-ov-file#3-30x-fastq-bam-bamdeduplicated-snvvcfsvvcf-add-035-for-a-raft-of-qc-reports)) to track AWS cluster usage costs in real time and to impose limits where appropriate (by user and project). This works by overriding the `--comment` flag to carry `project/budget` tags, which are applied to the ephemeral AWS resources, enabling cost tracking and controls.

* To change the `--comment` value in v`0.0.8` of the pcluster-slurm plugin, set the environment variable `SMK_SLURM_COMMENT` (default: `RandD`); see the example below.
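
A minimal sketch; `project-alpha` is a hypothetical placeholder for whatever tag your cost-allocation setup expects:

```bash
# Tag every job submitted from this shell so sbatch's --comment carries
# the project/budget label consumed by the cost-allocation tooling.
export SMK_SLURM_COMMENT=project-alpha
snakemake --executor pcluster-slurm --default-resources slurm_partition='i64' -j 10
```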
 
 
 
 


            
