| Field | Value |
| --- | --- |
| Name | dask4dvc |
| Version | 0.2.3 |
| home_page | |
| Summary | Use dask to run the DVC graph |
| upload_time | 2023-04-28 12:30:16 |
| maintainer | |
| docs_url | None |
| author | zincwarecode |
| requires_python | >=3.8,<4.0 |
| license | Apache-2.0 |
| keywords | data-science, hpc, dask, dvc |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Dask4DVC - Distributed Node Execution
[DVC](https://dvc.org) provides tools for building and executing the computational graph locally through various methods.
The `dask4dvc` package combines [Dask Distributed](https://distributed.dask.org/) with DVC to make it easier to use with HPC managers like [Slurm](https://github.com/SchedMD/slurm).
The `dask4dvc repro` command will run the DVC graph in parallel where possible.
Currently, `dask4dvc run` will not run stages per experiment sequentially.
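
To illustrate what running a graph "in parallel where possible" means, here is a minimal stdlib-only sketch. It is not how `dask4dvc` is implemented: a thread pool stands in for Dask workers, and the stage names and the `run_stage` function are hypothetical. A stage is submitted as soon as every stage it depends on has finished, so independent stages can run concurrently.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def run_stage(name: str) -> str:
    """Hypothetical stand-in for executing one DVC stage."""
    return f"ran {name}"


def run_graph(dependencies: dict, max_workers: int = 4) -> list:
    """Run stages in dependency order; return names in completion order.

    `dependencies` maps each stage to the set of stages it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    finished = []
    pending = {}  # future -> stage name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Submit every stage whose dependencies are all done.
            for stage in sorter.get_ready():
                pending[pool.submit(run_stage, stage)] = stage
            # Block until at least one running stage completes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the stage
                stage = pending.pop(future)
                finished.append(stage)
                sorter.done(stage)  # may unlock downstream stages
    return finished


# Hypothetical graph: "train_a" and "train_b" both need "prepare";
# "evaluate" needs both training stages.
graph = {
    "prepare": set(),
    "train_a": {"prepare"},
    "train_b": {"prepare"},
    "evaluate": {"train_a", "train_b"},
}
order = run_graph(graph)
```

In this sketch `train_a` and `train_b` may finish in either order, but `prepare` always completes first and `evaluate` last.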
> :warning: This is an experimental package **not** affiliated in any way with Iterative or DVC.
## Usage
Dask4DVC provides a CLI similar to DVC.
- `dvc repro` becomes `dask4dvc repro`.
- `dvc queue start` becomes `dask4dvc run`.
You can follow the progress using `dask4dvc <cmd> --dashboard`.
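
For scripts that already build `dvc` command lines, the mapping above can be written down as a small helper. This is only a sketch: the `translate` function and the `DVC_TO_DASK4DVC` table are hypothetical names, and only the two commands and the `--dashboard` flag mentioned above are covered.

```python
from typing import List, Sequence

# Mapping taken from the list above: DVC subcommand -> dask4dvc subcommand.
DVC_TO_DASK4DVC = {
    ("repro",): ["repro"],
    ("queue", "start"): ["run"],
}


def translate(dvc_argv: Sequence[str], dashboard: bool = False) -> List[str]:
    """Translate e.g. ["dvc", "queue", "start"] to ["dask4dvc", "run"]."""
    if not dvc_argv or dvc_argv[0] != "dvc":
        raise ValueError("expected a command starting with 'dvc'")
    subcommand = tuple(dvc_argv[1:])
    if subcommand not in DVC_TO_DASK4DVC:
        raise ValueError(f"no dask4dvc equivalent known for {' '.join(dvc_argv)!r}")
    command = ["dask4dvc", *DVC_TO_DASK4DVC[subcommand]]
    if dashboard:
        # Ask dask4dvc to expose the progress dashboard.
        command.append("--dashboard")
    return command
```

The returned list can be passed to `subprocess.run(command, check=True)` to launch the command.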
### SLURM Cluster
You can use `dask4dvc` with a Slurm cluster.
This requires a running Dask scheduler:
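```python
from dask_jobqueue import SLURMCluster

# Each Slurm job provides one single-core worker process with one GPU.
cluster = SLURMCluster(
    cores=1,
    memory="128GB",
    queue="gpu",
    processes=1,
    walltime="8:00:00",
    job_cpu=1,
    # Extra #SBATCH directives; newer dask_jobqueue releases name this
    # parameter `job_extra_directives`.
    job_extra=["-N 1", "--cpus-per-task=1", "--tasks-per-node=64", "--gres=gpu:1"],
    # Fix the scheduler port so that `--address` can point at it.
    scheduler_options={"port": 31415},
)
# Scale the number of workers up and down with the workload.
cluster.adapt()
```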
With this setup, you can run `dask4dvc repro --address 127.0.0.1:31415`, where `31415` is the example port configured for the scheduler.
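
Before pointing `dask4dvc` at an address, it can be useful to confirm that something is listening there. The following stdlib-only sketch does a plain TCP check; the function name `scheduler_reachable` is hypothetical, and a successful connection only shows that the port is open, not that a Dask scheduler is behind it.

```python
import socket


def scheduler_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # The connection is closed again as soon as the check succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False
```

For the example above, `scheduler_reachable("127.0.0.1", 31415)` would be the matching call.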
You can also use config files with `dask4dvc repro --config myconfig.yaml`.
All `dask.distributed` Clusters should be supported.
```yaml
default:
  SGECluster:
    queue: regular
    cores: 10
    memory: 16 GB
```
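
One way to read such a config is as "profile, then cluster class name, then keyword arguments". The sketch below shows that interpretation under stated assumptions: it is not the loader `dask4dvc` uses, `build_cluster` and `FakeSGECluster` are hypothetical names, and the dictionary is written out by hand in place of a YAML parser.

```python
from typing import Any, Callable, Mapping


def build_cluster(
    config: Mapping[str, Mapping[str, Mapping[str, Any]]],
    registry: Mapping[str, Callable[..., Any]],
    profile: str = "default",
) -> Any:
    """Instantiate the single cluster class named under `profile`."""
    section = config[profile]
    if len(section) != 1:
        raise ValueError(f"expected exactly one cluster class under {profile!r}")
    (class_name, kwargs), = section.items()
    if class_name not in registry:
        raise KeyError(f"unknown cluster class {class_name!r}")
    # The nested mapping becomes the keyword arguments of the cluster class.
    return registry[class_name](**kwargs)


class FakeSGECluster:
    """Stand-in so the sketch runs without a batch system."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


# The dict a YAML parser would produce for the config shown above.
parsed = {"default": {"SGECluster": {"queue": "regular", "cores": 10, "memory": "16 GB"}}}
cluster = build_cluster(parsed, {"SGECluster": FakeSGECluster})
```

With `dask_jobqueue` installed, the registry could instead be `{"SGECluster": dask_jobqueue.SGECluster}`, so that the same config yields a real cluster object.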

            
         
Raw data

{
    "_id": null,
    "home_page": "",
    "name": "dask4dvc",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "data-science,HPC,dask,DVC",
    "author": "zincwarecode",
    "author_email": "zincwarecode@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/f7/b6/18163d26a00668f314f1d3c3146ec93fa9f1fe78e0253ae0cad0216c4a3a/dask4dvc-0.2.3.tar.gz",
    "platform": null,
    "description": "[](https://coveralls.io/github/zincware/dask4dvc?branch=main)\n\n[](https://badge.fury.io/py/dask4dvc)\n[](https://github.com/zincware)\n\n# Dask4DVC - Distributed Node Exectuion\n[DVC](dvc.org) provides tools for building and executing the computational graph locally through various methods. \nThe `dask4dvc` package combines [Dask Distributed](https://distributed.dask.org/) with DVC to make it easier to use with HPC managers like [Slurm](https://github.com/SchedMD/slurm).\n\nThe `dask4dvc repro` package will run the DVC graph in parallel where possible.\nCurrently, `dask4dvc run` will not run stages per experiment sequentially.\n\n> :warning: This is an experimental package **not** affiliated in any way with iterative or DVC.\n\n## Usage\nDask4DVC provides a CLI similar to DVC.\n\n- `dvc repro` becomes `dask4dvc repro`.\n- `dvc queue start` becomes `dask4dvc run`\n\nYou can follow the progress using `dask4dvc <cmd> --dashboard`.\n\n\n### SLURM Cluster\n\nYou can use `dask4dvc` easily with a slurm cluster.\nThis requires a running dask scheduler:\n```python\nfrom dask_jobqueue import SLURMCluster\n\ncluster = SLURMCluster(\n    cores=1, memory='128GB',\n    queue=\"gpu\",\n    processes=1,\n    walltime='8:00:00',\n    job_cpu=1,\n    job_extra=['-N 1', '--cpus-per-task=1', '--tasks-per-node=64', \"--gres=gpu:1\"],\n    scheduler_options={\"port\": 31415}\n)\ncluster.adapt()\n```\n\nwith this setup you can then run `dask4dvc repro --address 127.0.0.1:31415` on the example port `31415`.\n\nYou can also use config files with `dask4dvc repro --config myconfig.yaml`.\nAll `dask.distributed` Clusters should be supported.\n\n```yaml\ndefault:\n  SGECluster:\n    queue: regular\n    cores: 10\n    memory: 16 GB\n```\n\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Use dask to run the DVC graph",
    "version": "0.2.3",
    "split_keywords": [
        "data-science",
        "hpc",
        "dask",
        "dvc"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "17860e1d09f8e95893fe21b2715eaccbbc287baa7650dc7fe079172827d136ae",
                "md5": "cb7a3a80f311e9537bd9ba651debf1b7",
                "sha256": "e5adf2f493794d8f5750d32ce3ed834859d2826491d3c18c987b267b553ddc82"
            },
            "downloads": -1,
            "filename": "dask4dvc-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cb7a3a80f311e9537bd9ba651debf1b7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 12831,
            "upload_time": "2023-04-28T12:30:14",
            "upload_time_iso_8601": "2023-04-28T12:30:14.682083Z",
            "url": "https://files.pythonhosted.org/packages/17/86/0e1d09f8e95893fe21b2715eaccbbc287baa7650dc7fe079172827d136ae/dask4dvc-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f7b618163d26a00668f314f1d3c3146ec93fa9f1fe78e0253ae0cad0216c4a3a",
                "md5": "213e2c010bfddc9491583f0512e13472",
                "sha256": "de696c0c9e79f5583a4352434bee41f321113bb19f8a7303fa3627a82bc3accb"
            },
            "downloads": -1,
            "filename": "dask4dvc-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "213e2c010bfddc9491583f0512e13472",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 11232,
            "upload_time": "2023-04-28T12:30:16",
            "upload_time_iso_8601": "2023-04-28T12:30:16.370058Z",
            "url": "https://files.pythonhosted.org/packages/f7/b6/18163d26a00668f314f1d3c3146ec93fa9f1fe78e0253ae0cad0216c4a3a/dask4dvc-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-28 12:30:16",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "dask4dvc"
}