| Name | octorun |
| Version | 0.2.1 |
| download | |
| home_page | None |
| Summary | A command-line tool for distributed parallel execution across multiple GPUs |
| upload_time | 2025-10-25 20:26:36 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | MIT |
| keywords | cli, deep-learning, distributed, gpu, parallel |
| VCS | https://github.com/HarborYuan/OctoRun |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
<div align="center">
# 🐙 OctoRun
**Distributed Parallel Execution Made Simple**
*A powerful command-line tool for running Python scripts across multiple GPUs with intelligent task management and monitoring*
[PyPI](https://pypi.org/project/octorun/)
[Python](https://www.python.org/downloads/)
[CUDA](https://developer.nvidia.com/cuda-downloads)
[License: MIT](LICENSE)
[CI](https://github.com/HarborYuan/OctoRun/actions)
---
</div>
## 📋 Overview
**OctoRun** is designed to help you run computationally intensive Python scripts across multiple GPUs efficiently. It automatically manages GPU allocation, chunks your workload, handles failures with retry mechanisms, and provides comprehensive monitoring and logging.
## ✨ Key Features
- 🔍 **Automatic GPU Detection**: Automatically detects and utilizes available GPUs
- 🧩 **Intelligent Chunk Management**: Divides work into chunks and distributes across GPUs
- 🔄 **Failure Recovery**: Automatic retry mechanism for failed chunks
- 📊 **Comprehensive Logging**: Detailed logging for monitoring and debugging
- ⚙️ **Flexible Configuration**: JSON-based configuration with CLI overrides
- 🎯 **Kwargs Support**: Pass custom arguments to your scripts via config or CLI
- 💾 **Memory Monitoring**: Monitor GPU memory usage and thresholds
- 🔒 **Lock Management**: Prevent duplicate processing of chunks
## 🚀 Installation
You can install OctoRun using `pip` or `uv`.
### Via pip
```bash
pip install octorun
```
### Via uv
```bash
# Install globally
uv tool install octorun
# Install in your project
uv add octorun
```
### Optional extras
- Benchmark tooling: `pip install "octorun[benchmark]"` (installs PyTorch with CUDA support)
## ⚡ Quick Start
1. **Create Configuration**:
```bash
octorun save_config --script ./your_script.py
```
2. **Run Your Script**:
```bash
octorun run
```
3. **Monitor GPUs**:
```bash
octorun list_gpus -d
```
## 🎮 Commands
### `run` (r)
Run your script with the specified configuration.
```bash
octorun run --config config.json [--kwargs '{"key": "value"}']
```
### `save_config` (s)
Generate a default configuration file.
```bash
octorun save_config --script ./your_script.py
```
### `list_gpus` (l)
List available GPUs and their current usage.
```bash
octorun list_gpus [--detailed]
```
The `--detailed` flag (or `-d`) provides a more comprehensive view of GPU stats, including memory usage, temperature, and running processes.
### `benchmark` (b)
Run a benchmark to determine the optimal number of parallel processes for your GPUs.
```bash
octorun benchmark
```
This command runs a series of tests to help you configure the `gpus` parameter in your `config.json` for the best performance.
It requires the optional `benchmark` extra (`pip install "octorun[benchmark]"`) so that PyTorch is available.
## ⚙️ Configuration
OctoRun uses a `config.json` file for configuration. You can generate a default one with `octorun save_config`.
| Option | Description | Default |
| ------------------ | -------------------------------------------- | -------------- |
| `script_path` | Path to your Python script | - |
| `gpus` | "auto" or list of GPU IDs | "auto" |
| `total_chunks` | Number of chunks to divide work into | 128 |
| `log_dir` | Directory for log files | "./logs" |
| `chunk_lock_dir` | Directory for chunk lock files | "./logs/locks" |
| `monitor_interval` | Monitoring interval in seconds | 60 |
| `restart_failed` | Whether to restart failed processes | false |
| `max_retries` | Maximum retries for failed chunks | 3 |
| `memory_threshold` | Memory threshold percentage | 90 |
| `kwargs` | Custom arguments to pass to your script | {} |
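
Putting the defaults together, a generated `config.json` might look roughly like the sketch below. Field names and default values come straight from the table above; the exact file that `octorun save_config` writes may differ in ordering or include additional fields.

```json
{
    "script_path": "./your_script.py",
    "gpus": "auto",
    "total_chunks": 128,
    "log_dir": "./logs",
    "chunk_lock_dir": "./logs/locks",
    "monitor_interval": 60,
    "restart_failed": false,
    "max_retries": 3,
    "memory_threshold": 90,
    "kwargs": {}
}
```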
## 🎯 Using Kwargs
You can pass custom arguments to your script via the `kwargs` object in your `config.json` or directly through the CLI.
**CLI kwargs will override config file kwargs.**
```bash
octorun run --kwargs '{"batch_size": 128, "learning_rate": 0.005}'
```
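
How kwargs reach your script is not spelled out here, but given the script contract in the next section (custom argparse flags such as `--batch_size`), each key is presumably forwarded as a `--key value` flag. Under that assumption, the command above would invoke your script for one chunk roughly like this:

```bash
# Assumed forwarding (illustrative only): kwargs keys become --key value flags
python your_script.py --gpu_id 0 --chunk_id 0 --total_chunks 128 \
    --batch_size 128 --learning_rate 0.005
```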
## 🔧 Script Implementation
Your script must accept the following arguments:
- `--gpu_id`: GPU device ID (int)
- `--chunk_id`: Current chunk number (int)
- `--total_chunks`: Total number of chunks (int)
Here is an example of how to structure your script:
```python
import argparse
import torch

def main():
    parser = argparse.ArgumentParser()

    # Required OctoRun arguments
    parser.add_argument('--gpu_id', type=int, required=True)
    parser.add_argument('--chunk_id', type=int, required=True)
    parser.add_argument('--total_chunks', type=int, required=True)

    # Your custom arguments
    parser.add_argument('--batch_size', type=int, default=32)
    parser.add_argument('--learning_rate', type=float, default=0.001)
    parser.add_argument('--model_type', type=str, default='default')
    parser.add_argument('--epochs', type=int, default=1)
    parser.add_argument('--output_dir', type=str, default='./output')

    args = parser.parse_args()

    # Set the GPU device
    if torch.cuda.is_available():
        torch.cuda.set_device(args.gpu_id)
        print(f"Using GPU {args.gpu_id}")

    print(f"Processing chunk {args.chunk_id}/{args.total_chunks}")

    # Your logic here

if __name__ == "__main__":
    main()
```
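
OctoRun hands each worker its `chunk_id` and `total_chunks`, but mapping that pair onto your data is up to your script. A minimal sketch (not part of OctoRun itself; `items` stands in for whatever your workload is) is a contiguous slice whose sizes differ by at most one element:

```python
def select_chunk(items, chunk_id, total_chunks):
    """Return the contiguous slice of `items` owned by chunk `chunk_id`."""
    n = len(items)
    start = chunk_id * n // total_chunks
    end = (chunk_id + 1) * n // total_chunks
    return items[start:end]

# Example: 10 items across 4 chunks -> sizes 2, 3, 2, 3
print(select_chunk(list(range(10)), 0, 4))  # [0, 1]
print(select_chunk(list(range(10)), 3, 4))  # [7, 8, 9]
```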
## 🤝 Contributing
Contributions are welcome! Please fork the repository, create a feature branch, and submit a pull request.
## 📄 License
This project is licensed under the **MIT License**.
Raw data
{
    "_id": null,
    "home_page": null,
    "name": "octorun",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Haobo Yuan <haoboyuan@ucmerced.edu>",
    "keywords": "cli, deep-learning, distributed, gpu, parallel",
    "author": null,
    "author_email": "Haobo Yuan <haoboyuan@ucmerced.edu>",
    "download_url": "https://files.pythonhosted.org/packages/e2/46/50db1fb43a392e68ca2f48d88776a828a866b5fdb22e91e38a6d0af54b23/octorun-0.2.1.tar.gz",
    "platform": null,
"description": "<div align=\"center\">\n\n# \ud83d\udc19 OctoRun\n\n**Distributed Parallel Execution Made Simple**\n\n*A powerful command-line tool for running Python scripts across multiple GPUs with intelligent task management and monitoring*\n\n[](https://pypi.org/project/octorun/)\n[](https://www.python.org/downloads/)\n[](https://developer.nvidia.com/cuda-downloads)\n[](LICENSE)\n[](https://github.com/HarborYuan/OctoRun/actions)\n\n---\n\n</div>\n\n## \ud83d\udccb Overview\n\n**OctoRun** is designed to help you run computationally intensive Python scripts across multiple GPUs efficiently. It automatically manages GPU allocation, chunks your workload, handles failures with retry mechanisms, and provides comprehensive monitoring and logging.\n\n## \u2728 Key Features\n\n- \ud83d\udd0d **Automatic GPU Detection**: Automatically detects and utilizes available GPUs\n- \ud83e\udde9 **Intelligent Chunk Management**: Divides work into chunks and distributes across GPUs\n- \ud83d\udd04 **Failure Recovery**: Automatic retry mechanism for failed chunks\n- \ud83d\udcca **Comprehensive Logging**: Detailed logging for monitoring and debugging\n- \u2699\ufe0f **Flexible Configuration**: JSON-based configuration with CLI overrides\n- \ud83c\udfaf **Kwargs Support**: Pass custom arguments to your scripts via config or CLI\n- \ud83d\udcbe **Memory Monitoring**: Monitor GPU memory usage and thresholds\n- \ud83d\udd12 **Lock Management**: Prevent duplicate processing of chunks\n\n## \ud83d\ude80 Installation\n\nYou can install OctoRun using `pip` or `uv`.\n\n### Via pip\n```bash\npip install octorun\n```\n\n### Via uv\n```bash\n# Install globally\nuv tool install octorun\n\n# Install in your project\nuv add octorun\n```\n\n### Optional extras\n- Benchmark tooling: `pip install \"octorun[benchmark]\"` (installs PyTorch with CUDA support)\n\n## \u26a1 Quick Start\n\n1. **Create Configuration**:\n ```bash\n octorun save_config --script ./your_script.py\n ```\n\n2. **Run Your Script**:\n ```bash\n octorun run\n ```\n\n3. **Monitor GPUs**:\n ```bash\n octorun list_gpus -d\n ```\n\n## \ud83c\udfae Commands\n\n### `run` (r)\n\nRun your script with the specified configuration.\n\n```bash\noctorun run --config config.json [--kwargs '{\"key\": \"value\"}']\n```\n\n### `save_config` (s)\n\nGenerate a default configuration file.\n\n```bash\noctorun save_config --script ./your_script.py\n```\n\n### `list_gpus` (l)\n\nList available GPUs and their current usage.\n\n```bash\noctorun list_gpus [--detailed]\n```\n\nThe `detailed` flag provides a more comprehensive view of GPU stats, including memory usage, temperature, and running processes.\n\n### `benchmark` (b)\n\nRun a benchmark to determine the optimal number of parallel processes for your GPUs.\n\n```bash\noctorun benchmark\n```\n\nThis command runs a series of tests to help you configure the `gpus` parameter in your `config.json` for the best performance.\nRequires the optional benchmark extra (`pip install \"octorun[benchmark]\"`) so PyTorch is available.\n\n## \u2699\ufe0f Configuration\n\nOctoRun uses a `config.json` file for configuration. 
You can generate a default one with `octorun save_config`.\n\n| Option | Description | Default |\n| ------------------ | -------------------------------------------- | -------------- |\n| `script_path` | Path to your Python script | - |\n| `gpus` | \"auto\" or list of GPU IDs | \"auto\" |\n| `total_chunks` | Number of chunks to divide work into | 128 |\n| `log_dir` | Directory for log files | \"./logs\" |\n| `chunk_lock_dir` | Directory for chunk lock files | \"./logs/locks\" |\n| `monitor_interval` | Monitoring interval in seconds | 60 |\n| `restart_failed` | Whether to restart failed processes | false |\n| `max_retries` | Maximum retries for failed chunks | 3 |\n| `memory_threshold` | Memory threshold percentage | 90 |\n| `kwargs` | Custom arguments to pass to your script | {} |\n\n## \ud83c\udfaf Using Kwargs\n\nYou can pass custom arguments to your script via the `kwargs` object in your `config.json` or directly through the CLI.\n\n**CLI kwargs will override config file kwargs.**\n\n```bash\noctorun run --kwargs '{\"batch_size\": 128, \"learning_rate\": 0.005}'\n```\n\n## \ud83d\udd27 Script Implementation\n\nYour script must accept the following arguments:\n\n- `--gpu_id`: GPU device ID (int)\n- `--chunk_id`: Current chunk number (int)\n- `--total_chunks`: Total number of chunks (int)\n\nHere is an example of how to structure your script:\n\n```python\nimport argparse\nimport torch\n\ndef main():\n parser = argparse.ArgumentParser()\n \n # Required OctoRun arguments\n parser.add_argument('--gpu_id', type=int, required=True)\n parser.add_argument('--chunk_id', type=int, required=True)\n parser.add_argument('--total_chunks', type=int, required=True)\n \n # Your custom arguments\n parser.add_argument('--batch_size', type=int, default=32)\n parser.add_argument('--learning_rate', type=float, default=0.001)\n parser.add_argument('--model_type', type=str, default='default')\n parser.add_argument('--epochs', type=int, default=1)\n parser.add_argument('--output_dir', type=str, default='./output')\n \n args = parser.parse_args()\n \n # Set the GPU device\n if torch.cuda.is_available():\n torch.cuda.set_device(args.gpu_id)\n print(f\"Using GPU {args.gpu_id}\")\n \n print(f\"Processing chunk {args.chunk_id}/{args.total_chunks}\")\n \n # Your logic here\n\nif __name__ == \"__main__\":\n main()\n```\n\n## \ud83e\udd1d Contributing\n\nContributions are welcome! Please fork the repository, create a feature branch, and submit a pull request.\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the **MIT License**.\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A command-line tool for distributed parallel execution across multiple GPUs",
"version": "0.2.1",
"project_urls": {
"Homepage": "https://github.com/HarborYuan/OctoRun",
"Repository": "https://github.com/HarborYuan/OctoRun"
},
"split_keywords": [
"cli",
" deep-learning",
" distributed",
" gpu",
" parallel"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9a7f9cc94bc38f1b111a8dd8eec6cde6eaf41df464b49daff88134a503ec65f3",
"md5": "b106d8097832982c0345c0adeaf36cb4",
"sha256": "240fa6d85a73db0d99d4da3bfff6c40c2d6df785d5e9b0dc9524a3073db431f6"
},
"downloads": -1,
"filename": "octorun-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b106d8097832982c0345c0adeaf36cb4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 20222,
"upload_time": "2025-10-25T20:26:35",
"upload_time_iso_8601": "2025-10-25T20:26:35.392407Z",
"url": "https://files.pythonhosted.org/packages/9a/7f/9cc94bc38f1b111a8dd8eec6cde6eaf41df464b49daff88134a503ec65f3/octorun-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e24650db1fb43a392e68ca2f48d88776a828a866b5fdb22e91e38a6d0af54b23",
"md5": "27f254174c631bfae0c2113dc934f3eb",
"sha256": "8a45abce1f2ccd709a1cd50016565551300839063ded5bb690186c12a1d209b3"
},
"downloads": -1,
"filename": "octorun-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "27f254174c631bfae0c2113dc934f3eb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 44336,
"upload_time": "2025-10-25T20:26:36",
"upload_time_iso_8601": "2025-10-25T20:26:36.981338Z",
"url": "https://files.pythonhosted.org/packages/e2/46/50db1fb43a392e68ca2f48d88776a828a866b5fdb22e91e38a6d0af54b23/octorun-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-25 20:26:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "HarborYuan",
"github_project": "OctoRun",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "octorun"
}