ml-scheduler


Name: ml-scheduler
Version: 1.2.0
Summary: A lightweight machine learning experiment scheduler that automates resource management (e.g., GPUs and models) and batch-runs experiments with just a few lines of Python code.
Upload time: 2024-06-29 14:35:53
Requires Python: >=3.8
Keywords: artificial intelligence, async, large language model, machine learning, scheduler
# ml_scheduler

[![PyPI version](https://badge.fury.io/py/ml-scheduler.svg)](http://badge.fury.io/py/ml-scheduler)
[![License](https://img.shields.io/github/license/mashape/apistatus.svg)](https://pypi.python.org/pypi/ml_scheduler/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://timothycrosley.github.io/isort/)
<!--[![Test Status](https://github.com/huyiwen/ml_scheduler/workflows/Test/badge.svg?branch=develop)](https://github.com/huyiwen/ml_scheduler/actions?query=workflow%3ATest)
[![Lint Status](https://github.com/huyiwen/ml_scheduler/workflows/Lint/badge.svg?branch=develop)](https://github.com/huyiwen/ml_scheduler/actions?query=workflow%3ALint)
[![codecov](https://codecov.io/gh/huyiwen/ml_scheduler/branch/main/graph/badge.svg)](https://codecov.io/gh/huyiwen/ml_scheduler)
[![Join the chat at https://gitter.im/huyiwen/ml_scheduler](https://badges.gitter.im/huyiwen/ml_scheduler.svg)](https://gitter.im/huyiwen/ml_scheduler?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
[![Downloads](https://pepy.tech/badge/ml_scheduler)](https://pepy.tech/project/ml_scheduler)-->


[**ML Scheduler**](https://github.com/huyiwen/ml_scheduler/) is a lightweight machine learning experiment scheduler that automates resource management (e.g., GPUs and models) and batch-runs experiments with just a few lines of Python code.

## Quick Start

1. Install ml-scheduler:

```bash
pip install ml-scheduler
```

or install from the GitHub repository:

```bash
git clone https://github.com/huyiwen/ml_scheduler
cd ml_scheduler
pip install -e .
```

2. Create a Python script:

```python
import ml_scheduler

cuda = ml_scheduler.pools.CUDAPool([0, 2], 90)
disk = ml_scheduler.pools.DiskPool('/one-fs')


@ml_scheduler.exp_func
async def mmlu(exp: ml_scheduler.Exp, model, checkpoint):

    source_dir = f"/another-fs/model/{model}/checkpoint-{checkpoint}"
    target_dir = f"/one-fs/model/{model}-{checkpoint}"

    # resources will be cleaned up after exiting the function
    disk_resource = await exp.get(
        disk.copy_folder,
        source_dir,
        target_dir,
        cleanup_target=True,
    )
    cuda_resource = await exp.get(cuda.allocate, 1)

    # run inference
    args = [
        "python", "inference.py", "--model", target_dir, "--dataset", "mmlu", "--cuda",  str(cuda_resource[0])
    ]
    stdout = await exp.run(args=args)
    await exp.report({'Accuracy': stdout})


mmlu.run_csv("experiments.csv", ['Accuracy'])
```

Mark the function with `@ml_scheduler.exp_func` and declare it `async` to make it an experiment function. The function must take an `exp` argument as its first parameter.

Then use `await exp.get` to acquire resources and `await exp.run` to run the experiment. Both calls are non-blocking, so multiple experiments can run concurrently while any one of them is waiting.
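The non-blocking behavior is ordinary asyncio concurrency. The sketch below uses plain asyncio only (none of ml-scheduler's API) to illustrate why awaiting inside several experiment coroutines lets them overlap:

```python
import asyncio

async def fake_experiment(name: str, seconds: float) -> str:
    # Stand-in for an experiment function: the await yields control,
    # so other experiments make progress while this one waits.
    await asyncio.sleep(seconds)
    return f"{name} done"

async def main():
    # All three run concurrently; total wall time is roughly the
    # longest single sleep, not the sum of all three.
    return await asyncio.gather(
        fake_experiment("exp-a", 0.01),
        fake_experiment("exp-b", 0.02),
        fake_experiment("exp-c", 0.03),
    )

results = asyncio.run(main())
print(results)
```

This is the same mechanism that lets `await exp.get` and `await exp.run` pause one experiment while the scheduler advances the others.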

3. Create a CSV file `experiments.csv` with your arguments (`model` and `checkpoint` in this case):

```csv
model,checkpoint
alpacaflan-packing,200
alpacaflan-packing,400
alpacaflan-qlora,200-merged
alpacaflan-qlora,400-merged
```
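`run_csv` maps each row to one call of the experiment function, with column names matched to the function's arguments. A quick stdlib sketch shows what that file parses to (the inline string stands in for `experiments.csv`):

```python
import csv
from io import StringIO

# Inline copy of the experiments.csv shown above; each row becomes
# one invocation of the experiment function.
csv_text = """model,checkpoint
alpacaflan-packing,200
alpacaflan-packing,400
alpacaflan-qlora,200-merged
alpacaflan-qlora,400-merged
"""

rows = list(csv.DictReader(StringIO(csv_text)))
print(rows[0])    # {'model': 'alpacaflan-packing', 'checkpoint': '200'}
print(len(rows))  # 4 experiments scheduled
```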

4. Run the script:

```bash
python run.py
```

The results (`Accuracy` in this case) and some other information will be saved in `results.csv`.
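The exact columns of `results.csv` beyond the reported metrics depend on what ml-scheduler records, so the sketch below assumes only that an `Accuracy` column is present (the inline values are hypothetical). It picks the best-scoring row with the stdlib `csv` module:

```python
import csv
from io import StringIO

# Hypothetical results.csv content: the argument columns plus the
# reported 'Accuracy' metric (real files may carry extra columns).
results_text = """model,checkpoint,Accuracy
alpacaflan-packing,200,0.41
alpacaflan-packing,400,0.43
"""

best = max(csv.DictReader(StringIO(results_text)),
           key=lambda row: float(row["Accuracy"]))
print(best["model"], best["checkpoint"], best["Accuracy"])
```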

## More Examples

- [Copy and run](/examples/copy_and_run)

            
