hydraMPP


| Field | Value |
| --- | --- |
| Name | hydraMPP |
| Version | 0.0.4 |
| Home page | https://github.com/raw-lab/hydraMPP |
| Summary | A simple yet powerful library for Distributed Computing |
| Upload time | 2024-08-06 18:44:56 |
| Author | Jose L. Figueroa III |
| Requires Python | >=3.6 |
| License | BSD License |
| Requirements | No requirements were recorded. |

# HydraMPP

## A massive parallel processing library for distributed processing in Python

HydraMPP is a library that makes it easier to create scalable distributed parallel processing applications.  
It will function seamlessly from a single computer to a computing cluster environment with multiple nodes.

## Requirements

HydraMPP is designed to be lightweight and has few dependencies.  

Python >= 3.6

## Install

### pip

HydraMPP can be easily installed from PyPI through pip:

```bash
pip install hydraMPP
```

If you don't have administrative permission and get an error, try the ```--user``` flag to install HydraMPP in your home folder.

```bash
pip install --user hydraMPP
```

### Anaconda

HydraMPP is available through the conda-forge channel on Anaconda.

```bash
conda install -c conda-forge HydraMPP
```

## Usage

### Step 1: Import the library

The HydraMPP library can be imported in Python using:

```python
import hydraMPP
```

### Step 2: Tag methods

Methods or functions that you would like to use with HydraMPP for parallel processing need to be tagged:

```python
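import time
import hydraMPP

# Tag the function so HydraMPP can dispatch it as a parallel job.
@hydraMPP.remote
def my_slow_function():
    time.sleep(10)  # stand-in for slow work
    return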
```
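
For intuition about what tagging does, here is a minimal sketch of a decorator that records a function and attaches a `.remote` launcher to it. This is not HydraMPP's implementation: `toy_remote`, `REGISTRY`, and the run-immediately behaviour are assumptions made purely for illustration.

```python
# Toy illustration only; not HydraMPP's actual code.
REGISTRY = {}  # hypothetical table of tagged functions, keyed by name


def toy_remote(func):
    """Tag func: record it by name and attach a .remote() launcher."""
    REGISTRY[func.__name__] = func

    def launch(*args, **kwargs):
        # A distributed library would queue the call for a worker;
        # this sketch simply runs it in the current process.
        return func(*args, **kwargs)

    func.remote = launch
    return func


@toy_remote
def add(a, b):
    return a + b


print(add.remote(2, 3))  # runs through the attached launcher
print(sorted(REGISTRY))  # names of all tagged functions
```

The point of the pattern is that tagged functions stay callable as usual, while the extra attribute gives the library a hook for scheduling the call elsewhere.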

### Step 3: Initialize the connection(s)

HydraMPP can run in 3 modes:

1. local
2. host
3. client

### Step 4: Call your methods

Once HydraMPP has been initialized, call a tagged method through its ```.remote``` tag. The library will queue the job and dispatch it when enough CPUs are available, either locally or on another node in your setup.
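
The idea of queueing jobs until enough CPUs are free can be sketched with a small standard-library simulation. This is a conceptual toy, not HydraMPP's scheduler: the function `dispatch_order`, the strict first-in-first-out policy, and the job tuples are all assumptions chosen for illustration.

```python
from collections import deque


def dispatch_order(jobs, total_cpus):
    """Simulate queue-and-dispatch under a CPU budget.

    jobs: list of (name, cpus_needed, duration) tuples in submission order.
    Returns a list of (start_time, name) in the order jobs were started.
    """
    queue = deque(jobs)
    running = []  # (finish_time, cpus) for jobs currently in flight
    free = total_cpus
    clock = 0
    started = []
    while queue:
        name, cpus, duration = queue[0]
        if cpus > total_cpus:
            raise ValueError(f"{name} needs more CPUs than exist")
        if cpus <= free:
            # Enough CPUs are free: start the job at the head of the queue.
            queue.popleft()
            free -= cpus
            running.append((clock + duration, cpus))
            started.append((clock, name))
        else:
            # Not enough CPUs: advance time to the next job completion.
            running.sort()
            finish, released = running.pop(0)
            clock = max(clock, finish)
            free += released
    return started


# Three jobs on a 4-CPU machine: "c" must wait until both others finish.
print(dispatch_order([("a", 2, 10), ("b", 2, 5), ("c", 3, 1)], 4))
# [(0, 'a'), (0, 'b'), (10, 'c')]
```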

### Step 5: Get return values

Use ```hydraMPP.wait``` to check the status of running jobs.  
It will return two lists. The first is a list of job IDs for the jobs that have finished and the second a list of jobs in queue or still running.  
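
The "finished versus still pending" split is the same contract that the standard library's `concurrent.futures.wait` uses, so the polling pattern can be sketched without HydraMPP installed. The example below uses a thread pool as a stand-in; the arguments that `hydraMPP.wait` itself accepts are not described here, so none are assumed, and `slow_square` and `poll_until_done` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_square(x):
    time.sleep(0.05 * x)  # stand-in for slow work
    return x * x


def poll_until_done(futures, interval=0.01):
    """Poll repeatedly, splitting jobs into finished and pending each time."""
    pending = set(futures)
    finished = []
    while pending:
        # timeout=0 makes this a non-blocking status check
        done, pending = wait(pending, timeout=0)
        finished.extend(done)
        time.sleep(interval)
    return finished


with ThreadPoolExecutor(max_workers=2) as pool:
    jobs = [pool.submit(slow_square, n) for n in (1, 2, 3)]
    results = sorted(f.result() for f in poll_until_done(jobs))

print(results)  # [1, 4, 9]
```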
  
Once jobs have finished running, use ```hydraMPP.get``` to get the return value and some stats on the job.  
The return value of ```hydraMPP.get``` is a list with the following values:  
  
1. Boolean value stating if the job has finished
2. The method name
3. The return value
4. Number of CPUs used for the job
5. Time to run the job, in seconds
6. The hostname of the node that the job ran on
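
Because the result is a positional list, giving the six fields names can make calling code easier to read. The sketch below wraps a list of that shape in a `NamedTuple`; the class `JobResult`, its field names, and the sample values are hypothetical, and only the field order is taken from the list above.

```python
from typing import Any, NamedTuple


class JobResult(NamedTuple):
    """Field order follows the six documented values; names are hypothetical."""
    finished: bool
    method: str
    value: Any
    cpus: int
    seconds: float
    hostname: str


# Made-up sample shaped like the documented return list.
raw = [True, "my_slow_function", None, 1, 10.02, "node01"]

result = JobResult(*raw)
if result.finished:
    print(f"{result.method} ran on {result.hostname} "
          f"using {result.cpus} CPU(s) in {result.seconds:.1f}s")
```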

## Status monitor

A script is included to monitor the status of HydraMPP while it is running.

```bash
usage: hydra-status.py [-h] [address] [port]

positional arguments:
  address     Address of the HydraMPP server to get status from [127.0.0.1]
  port        Port to connect to [24515]

options:
  -h, --help  show this help message and exit
```

This will query the status of HydraMPP and display some information on connected clients, available CPUs, and jobs in queue.  
It quits immediately after displaying the status. For continuous monitoring, use a tool like ```watch```.

```bash
watch -n1 hydra-status.py localhost
```

## SLURM

HydraMPP has a built-in function to utilize a SLURM environment.  
  
All you need to do is add the flag ```--hydraMPP-slurm $SLURM_JOB_NODELIST``` when executing your Python program, and HydraMPP will take care of configuring the host and clients.  
  
Make sure to call ```hydraMPP.init()``` once all required methods have been tagged with ```@hydraMPP.remote```.  
  
The ```--hydraMPP-cpus``` flag can be used to set the number of CPUs for each node to use. If set to '0' or omitted, HydraMPP will try to guess the number of CPUs available on each node.

```bash
#!/bin/bash
#SBATCH --job-name=My_Slurm_Job
#SBATCH --nodes=3
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=36
#SBATCH --mem=100G
#SBATCH --time=1-0
#SBATCH -o slurm-%x-%j.out

echo "====================================================="
echo "Start Time  : $(date)"
echo "Submit Dir  : $SLURM_SUBMIT_DIR"
echo "Job ID/Name : $SLURM_JOBID / $SLURM_JOB_NAME"
echo "Node List   : $SLURM_JOB_NODELIST"
echo "Num Tasks   : $SLURM_NTASKS total [$SLURM_NNODES nodes @ $SLURM_CPUS_ON_NODE CPUs/node]"
echo "======================================================"
echo ""

path/to/program.py --custom-args --hydraMPP-slurm $SLURM_JOB_NODELIST --hydraMPP-cpus $SLURM_CPUS_ON_NODE
```
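
If your program parses its own command line with `argparse`, unrecognized options such as the HydraMPP flags make `parse_args()` exit with an error. One way to tolerate them is `parse_known_args()`, sketched below. Whether HydraMPP requires this, and how it reads its own flags, is not stated in this README; `parse_program_args` is a hypothetical helper, and `--custom-args` is the placeholder from the script above, treated here as a boolean switch.

```python
import argparse


def parse_program_args(argv):
    """Parse this program's options and pass everything else through."""
    parser = argparse.ArgumentParser(description="Example program")
    parser.add_argument("--custom-args", action="store_true")
    # parse_known_args returns (namespace, list_of_unrecognized_strings)
    return parser.parse_known_args(argv)


# Sample command line shaped like the SLURM script's last line.
argv = ["--custom-args", "--hydraMPP-slurm", "node[01-03]", "--hydraMPP-cpus", "36"]
args, extras = parse_program_args(argv)
print(args.custom_args)  # the program's own option
print(extras)            # everything argparse did not recognize
```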

## Contact

The informatics point-of-contact for this project is [Dr. Richard Allen White III](https://github.com/raw-lab).  
If you have any questions or feedback, please feel free to get in touch by email.  
[Dr. Richard Allen White III](mailto:rwhit101@uncc.edu)  
[Jose Luis Figueroa III](mailto:jlfiguer@uncc.edu)  
Or [open an issue](https://github.com/raw-lab/hydrampp/issues).  

## License

Copyright 2024 Richard Allen White III, Jose Luis Figueroa III

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors
   may be used to endorse or promote products derived from this software
   without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

            
