llm-atc 0.1.4

Tools for fine tuning and serving LLMs

Author: Andrew Aikawa
Requires Python: ==3.10.*
Uploaded: 2023-09-25

<p align="center">
  <img height='100px' src="https://www.ocf.berkeley.edu/~asai/static/images/trainy.png">
</p>

![GitHub Repo stars](https://img.shields.io/github/stars/Trainy-ai/llm-atc?style=social)
[![](https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/TrainyAI)
[![](https://dcbadge.vercel.app/api/server/d67CMuKY5V)](https://discord.gg/d67CMuKY5V)

LLM-ATC (**A**ir **T**raffic **C**ontroller) is a CLI for fine-tuning and serving open-source models using your own cloud credentials. We hope this project lowers the cognitive overhead of orchestration for fine-tuning and model serving.

**Refer to the docs for the most up-to-date usage information. This README is updated less frequently.**

## Installation

Follow the instructions here [to install SkyPilot and provide cloud credentials](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html). We use SkyPilot for cloud orchestration. The steps to set up an environment are shown below.

```bash
# create a fresh environment
conda create -n "sky" python=3.10
conda activate sky

# For Macs, macOS >= 10.15 is required to install SkyPilot.
# On Apple Silicon devices (e.g. Apple M1), reinstall grpcio first:
pip uninstall grpcio; conda install -c conda-forge grpcio=1.43.0 --force-reinstall

# install the SkyPilot CLI and the dependencies for the clouds you want, e.g. GCP
pip install "skypilot[gcp] @ git+https://github.com/skypilot-org/skypilot.git" # for AWS, use skypilot[aws]

# Configure your cloud credentials. This is a GCP example. See
# https://skypilot.readthedocs.io/en/latest/getting-started/installation.html
# for examples with other cloud providers.
pip install google-api-python-client
conda install -c conda-forge google-cloud-sdk
gcloud init
gcloud auth application-default login

# double check that your credentials are properly set for your desired provider(s)
sky check
```
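
For AWS, the analogous setup is roughly the following. This is a sketch, not the authoritative steps; see the SkyPilot installation docs linked above for details.

```bash
# install the SkyPilot CLI with AWS support
pip install "skypilot[aws] @ git+https://github.com/skypilot-org/skypilot.git"

# configure AWS credentials (requires the AWS CLI)
pip install awscli
aws configure

# double check that the credentials are picked up
sky check
```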

### From PyPI

```bash
pip install llm-atc
```

### From source

```bash
# from the root of a clone of https://github.com/Trainy-ai/llm-atc
pip install -e .
```

## Finetuning

Supported fine-tuning methods:
- Vicuna-Llama (chat fine-tuning)

To start fine-tuning a model, use `llm-atc train`. For example:

```bash
# start training
llm-atc train --model_type vicuna --finetune_data ./vicuna_test.json --name myvicuna --description "This is a finetuned model that just says its name is vicuna" -c mycluster --cloud gcp --envs "MODEL_SIZE=7 WANDB_API_KEY=<my wandb key>" --accelerator A100-80G:4

# shutdown cluster when done
sky down mycluster
```
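
The `--finetune_data` file holds the chat conversations to train on. As a rough sketch, assuming the FastChat-style conversation schema used for Vicuna fine-tuning (confirm the exact fields against the docs), a minimal `vicuna_test.json` could be created like this:

```bash
# write a minimal, hypothetical chat fine-tuning dataset
cat > vicuna_test.json <<'EOF'
[
  {
    "id": "example-0",
    "conversations": [
      {"from": "human", "value": "What is your name?"},
      {"from": "gpt", "value": "My name is vicuna."}
    ]
  }
]
EOF
```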

If your client disconnects during training, the run will continue. You can check its status with `sky queue mycluster`.
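
For example, to monitor a detached run with standard SkyPilot commands (the job id comes from the queue output):

```bash
# list jobs running on the cluster
sky queue mycluster

# tail the logs of a specific job, e.g. job 1
sky logs mycluster 1
```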

When training completes, your model is saved by default to an object store belonging to the cloud provider that launched the training instance. For example:

```
# s3 location
s3://llm-atc/myvicuna
# GCS location
gs://llm-atc/myvicuna
```
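
You can inspect or download the saved checkpoint with your provider's CLI. A sketch, assuming the AWS CLI or `gsutil` is installed and authenticated:

```bash
# list the checkpoint files on S3
aws s3 ls s3://llm-atc/myvicuna/

# or on GCS
gsutil ls gs://llm-atc/myvicuna
```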

## Serving

`llm-atc serve` can serve both models from HuggingFace and models you've fine-tuned with `llm-atc`. For example:

```bash
# serve an llm-atc finetuned model, requires `llm-atc/` prefix and grabs model checkpoint from object store
llm-atc serve --name llm-atc/myvicuna --accelerator A100:1 -c servecluster --cloud gcp --region asia-southeast1 --envs "HF_TOKEN=<HuggingFace_token>"

# serve a HuggingFace model, e.g. `lmsys/vicuna-13b-v1.3`
llm-atc serve --name lmsys/vicuna-13b-v1.3 --accelerator A100:1 -c servecluster --cloud gcp --region asia-southeast1 --envs "HF_TOKEN=<HuggingFace_token>"
```

This creates an OpenAI-compatible API server on port 8000 of the cluster head, along with one model worker.
Make a request from your laptop with:
```bash
# get the ip address of the OpenAI server
ip=$(grep -A1 "Host servecluster" ~/.ssh/config | grep "HostName" | awk '{print $2}')

# test which models are available
curl http://$ip:8000/v1/models

# stop model server cluster
sky stop servecluster
```
You can connect to this server and develop applications with your fine-tuned models using other LLM frameworks like [LlamaIndex](https://github.com/jerryjliu/llama_index). Look at `examples/` to see how to interact with your API endpoint.
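
As a quick smoke test, you can also send a chat completion request directly to the OpenAI-compatible endpoint. The model name below (`llm-atc/myvicuna`) is an assumption; use whichever name `/v1/models` returned:

```bash
# send a chat completion request to the OpenAI-compatible endpoint
curl http://$ip:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llm-atc/myvicuna",
        "messages": [{"role": "user", "content": "What is your name?"}]
      }'
```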

## Telemetry

By default, LLM-ATC uses PostHog to collect anonymized data about when a train or serve request is made. Telemetry helps us identify where users are engaging with LLM-ATC. If you would like to disable telemetry, set:

```bash
export LLM_ATC_DISABLE=1
```

## How does it work?

Training, serving, and orchestration are powered by [SkyPilot](https://github.com/skypilot-org/skypilot), [FastChat](https://github.com/lm-sys/FastChat/), and [vLLM](https://github.com/vllm-project/vllm). We made this choice because we believe it allows people to train and deploy custom LLMs without cloud lock-in.

We currently rely on the default hyperparameters from the underlying training repositories. We plan to add options to override these so that users have more control over training, but for now we think the defaults should suffice for most use cases.


            
