# ml_scheduler
[![PyPI version](https://badge.fury.io/py/ml-scheduler.svg)](http://badge.fury.io/py/ml-scheduler)
[![License](https://img.shields.io/github/license/mashape/apistatus.svg)](https://pypi.python.org/pypi/ml_scheduler/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://timothycrosley.github.io/isort/)
<!--[![Test Status](https://github.com/huyiwen/ml_scheduler/workflows/Test/badge.svg?branch=develop)](https://github.com/huyiwen/ml_scheduler/actions?query=workflow%3ATest)
[![Lint Status](https://github.com/huyiwen/ml_scheduler/workflows/Lint/badge.svg?branch=develop)](https://github.com/huyiwen/ml_scheduler/actions?query=workflow%3ALint)
[![codecov](https://codecov.io/gh/huyiwen/ml_scheduler/branch/main/graph/badge.svg)](https://codecov.io/gh/huyiwen/ml_scheduler)
[![Join the chat at https://gitter.im/huyiwen/ml_scheduler](https://badges.gitter.im/huyiwen/ml_scheduler.svg)](https://gitter.im/huyiwen/ml_scheduler?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![Downloads](https://pepy.tech/badge/ml_scheduler)](https://pepy.tech/project/ml_scheduler)-->
[**ML Scheduler**](https://github.com/huyiwen/ml_scheduler/) is a lightweight machine learning experiment scheduler that automates resource management (e.g., GPUs and models) and batch-runs experiments with just a few lines of Python code.
## Quick Start
1. Install ml-scheduler:
```bash
pip install ml-scheduler
```
or install from the GitHub repository:
```bash
git clone https://github.com/huyiwen/ml_scheduler
cd ml_scheduler
pip install -e .
```
2. Create a Python script (e.g., `run.py`):
```python
import ml_scheduler

cuda = ml_scheduler.pools.CUDAPool([0, 2], 90)
disk = ml_scheduler.pools.DiskPool('/one-fs')


@ml_scheduler.exp_func
async def mmlu(exp: ml_scheduler.Exp, model, checkpoint):

    source_dir = f"/another-fs/model/{model}/checkpoint-{checkpoint}"
    target_dir = f"/one-fs/model/{model}-{checkpoint}"

    # resources are cleaned up automatically after the function exits
    disk_resource = await exp.get(
        disk.copy_folder,
        source_dir,
        target_dir,
        cleanup_target=True,
    )
    cuda_resource = await exp.get(cuda.allocate, 1)

    # run inference on the allocated GPU
    args = [
        "python", "inference.py",
        "--model", target_dir,
        "--dataset", "mmlu",
        "--cuda", str(cuda_resource[0]),
    ]
    stdout = await exp.run(args=args)
    await exp.report({'Accuracy': stdout})


mmlu.run_csv("experiments.csv", ['Accuracy'])
```
Mark the function with `@ml_scheduler.exp_func` and declare it `async` to make it an experiment function. The function must take an `ml_scheduler.Exp` object as its first argument.
Then use `await exp.get` to acquire resources (non-blocking) and `await exp.run` to run the experiment (also non-blocking). Non-blocking means that multiple experiments can run concurrently, as the sketch below illustrates.
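Under the hood this is ordinary `asyncio` cooperation. The following standalone sketch is not ml_scheduler's actual implementation; it only illustrates the scheduling idea, with a plain `asyncio.Semaphore` standing in for a resource pool:

```python
# Conceptual sketch only: a semaphore stands in for a CUDAPool.
import asyncio

gpu_slots = asyncio.Semaphore(2)  # pretend the pool holds two GPUs

async def experiment(row_id: int) -> None:
    async with gpu_slots:       # analogous to `await exp.get(cuda.allocate, 1)`
        print(f"experiment {row_id} acquired a GPU slot")
        await asyncio.sleep(1)  # stands in for `await exp.run(...)`
    # leaving the block releases the slot, like automatic resource cleanup

async def main() -> None:
    # all four experiments are scheduled at once;
    # the semaphore lets them proceed two at a time
    await asyncio.gather(*(experiment(i) for i in range(4)))

asyncio.run(main())
```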
3. Create a CSV file `experiments.csv` with your arguments (`model` and `checkpoint` in this case):
```csv
model,checkpoint
alpacaflan-packing,200
alpacaflan-packing,400
alpacaflan-qlora,200-merged
alpacaflan-qlora,400-merged
```
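Each CSV row is passed to the experiment function as keyword arguments. The loop below is a hypothetical illustration of that mapping; the real dispatch happens inside `run_csv`:

```python
# Hypothetical illustration of how rows map to calls; the actual
# scheduling is handled inside ml_scheduler's run_csv.
import csv

with open("experiments.csv", newline="") as f:
    for row in csv.DictReader(f):
        # e.g. the first row becomes model="alpacaflan-packing", checkpoint="200"
        print(f"would schedule mmlu(**{row})")
```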
4. Run the script:
```bash
python run.py
```
The results (`Accuracy` in this case) and some other information will be saved in `results.csv`.
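Since the output is plain CSV, it can be inspected with standard tools, for example with `pandas` (assuming it is installed; the exact columns beyond `Accuracy` depend on what ml_scheduler records):

```python
# Quick look at the collected results after a run.
import pandas as pd

results = pd.read_csv("results.csv")
print(results.head())
```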
## More Examples
- [Copy and run](/examples/copy_and_run)