llm-inference


| Field | Value |
| --- | --- |
| Name | llm-inference |
| Version | 0.0.6 |
| home_page | https://github.com/aniketmaurya/llm-inference |
| Summary | Large Language Models Inference API and Applications |
| upload_time | 2023-07-16 15:15:44 |
| docs_url | None |
| author | Aniket Maurya |
| requires_python | >=3.8 |
| license | Apache License 2.0 |
| keywords | llm, llama, gpt, falcon |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Large Language Model (LLM) Inference API and Chatbot 🦙

![project banner](https://github.com/aniketmaurya/llm-inference/raw/main/assets/llm-inference-min.png)

Inference API for LLMs such as LLaMA and Falcon, powered by Lit-GPT from [Lightning AI](https://lightning.ai).

```bash
pip install llm-inference
```

### Install from main branch
```bash
pip install git+https://github.com/aniketmaurya/llm-inference.git@main
```

> **Note**: You need to manually install [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) and set up the model weights to use this project.

```bash
pip install lit_gpt@git+https://github.com/aniketmaurya/install-lit-gpt.git@install
```
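
Because Lit-GPT is not installed automatically, it can help to confirm that it is importable before running the examples. The following is a minimal sketch using only the standard library. The module name `lit_gpt` is taken from the install command above; the helper name `is_installed` is made up for illustration and is not part of this project.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical pre-flight check; `lit_gpt` is the module name used in the
# pip command above.
if not is_installed("lit_gpt"):
    print("lit_gpt not found; install it with the pip command above.")
```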

## For Inference

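```python
from llm_inference import LLMInference, prepare_weights
from rich import print

# Prepare the model weights; the returned path is used as the checkpoint directory.
path = prepare_weights("EleutherAI/pythia-70m")
model = LLMInference(checkpoint_dir=path)

# Calling the model with a prompt returns the generated text.
print(model("New York is located in"))
```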


## How to use the Chatbot

```python
from llm_chain import LitGPTConversationChain, LitGPTLLM
from llm_inference import prepare_weights
from rich import print

# Prepare the model weights and pass the checkpoint directory as a string.
path = str(prepare_weights("lmsys/longchat-13b-16k"))
# 4-bit NF4 quantization (bitsandbytes); uses about 8.4 GB of GPU memory.
llm = LitGPTLLM(checkpoint_dir=path, quantize="bnb.nf4")
bot = LitGPTConversationChain.from_llm(llm=llm, verbose=True)

print(bot.send("hi, what is the capital of France?"))
```
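
The 8.4 GB figure in the comment above can be sanity-checked with rough arithmetic. The sketch below estimates the memory taken by the weights alone, assuming that "13b" in the model name means about 13 billion parameters and that NF4 stores each weight in 4 bits. It ignores quantization constants, any layers kept at higher precision, activations, and the attention cache, so it is a lower bound rather than a measurement. The function name `weight_memory_gb` is illustrative only.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: "13b" in the model name means roughly 13 billion parameters.
N_PARAMS = 13e9

print(f"4-bit weights : {weight_memory_gb(N_PARAMS, 4):.1f} GB")   # 6.5 GB
print(f"16-bit weights: {weight_memory_gb(N_PARAMS, 16):.1f} GB")  # 26.0 GB
```

Under these assumptions the 4-bit weights come to about 6.5 GB, which is consistent with a total of 8.4 GB once runtime overhead is added, and well below the roughly 26 GB that 16-bit weights would need.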

## Launch Chatbot App

<video width="320" height="240" controls>
  <source src="/assets/chatbot-demo.mov" type="video/mp4">
</video>

**1. Download weights**
```py
from llm_inference import prepare_weights
path = prepare_weights("lmsys/longchat-13b-16k")
```

**2. Launch Gradio App**

```bash
python examples/chatbot/gradio_demo.py
```



## For deploying as a REST API

Create a Python file `app.py` and initialize the `ServeLLaMA` App.

```python
# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest

import lightning as L

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
```

Then launch the app with the Lightning CLI:

```bash
lightning run app app.py
```
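
Once the app is running, a client can send prompts over HTTP. The sketch below uses only the standard library. The host, port, `/predict` route, and the `prompt` field name are all assumptions, since this README does not document the request schema; check the URL printed when the app starts and the definition of `PromptRequest` before relying on them. The helper names `build_request` and `send_prompt` are hypothetical.

```python
import json
import urllib.request

# Assumptions, not confirmed by this README: host, port, the "/predict" route
# and the "prompt" field name. Adjust them to what the running app reports.
DEFAULT_URL = "http://127.0.0.1:7777/predict"


def build_request(prompt: str, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying the prompt."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_prompt(prompt: str, url: str = DEFAULT_URL, timeout: float = 60.0):
    """Send the prompt to the server and decode its JSON reply."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `send_prompt("New York is located in")` would return the decoded JSON reply; its exact shape depends on the `Response` type.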

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/aniketmaurya/llm-inference",
    "name": "llm-inference",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "LLM,LLaMA,GPT,Falcon",
    "author": "Aniket Maurya",
    "author_email": "theaniketmaurya@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/54/17/56ce7b12de3af15ae7acf9358760ad9689e36392da2491888fa7299ad6b5/llm_inference-0.0.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Large Language Models Inference API and Applications",
    "version": "0.0.6",
    "project_urls": {
        "Homepage": "https://github.com/aniketmaurya/llm-inference"
    },
    "split_keywords": [
        "llm",
        "llama",
        "gpt",
        "falcon"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c6cd4bfeab074dd703d4977641711c16e5f4641d550c55d2890b797c71f90af7",
                "md5": "5a50437efe21dab18314b9a3f2441bea",
                "sha256": "08fe641c4511b1ad465a503c02b58752cf785ee00f4516a3b63e99103d1ea6ac"
            },
            "downloads": -1,
            "filename": "llm_inference-0.0.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5a50437efe21dab18314b9a3f2441bea",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 11980,
            "upload_time": "2023-07-16T15:15:42",
            "upload_time_iso_8601": "2023-07-16T15:15:42.881934Z",
            "url": "https://files.pythonhosted.org/packages/c6/cd/4bfeab074dd703d4977641711c16e5f4641d550c55d2890b797c71f90af7/llm_inference-0.0.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "541756ce7b12de3af15ae7acf9358760ad9689e36392da2491888fa7299ad6b5",
                "md5": "c813fdea738fb735019ce0ee3237a0ef",
                "sha256": "5bc77680d1df2f9af5b4cc9187fd47c473a2c98a32d0c59eb7427949ad19cbfe"
            },
            "downloads": -1,
            "filename": "llm_inference-0.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "c813fdea738fb735019ce0ee3237a0ef",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 810413,
            "upload_time": "2023-07-16T15:15:44",
            "upload_time_iso_8601": "2023-07-16T15:15:44.223869Z",
            "url": "https://files.pythonhosted.org/packages/54/17/56ce7b12de3af15ae7acf9358760ad9689e36392da2491888fa7299ad6b5/llm_inference-0.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-16 15:15:44",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aniketmaurya",
    "github_project": "llm-inference",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "llm-inference"
}
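
The `sha256` values under `digests` in the raw data above can be used to check a downloaded file before installing it. A minimal sketch with the standard library follows; the helper names `sha256_of` and `matches` are illustrative, and the local file path in the final comment is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare the file's SHA-256 with an expected hex digest."""
    return sha256_of(path) == expected_hex.lower()


# sha256 of the wheel, copied from the "digests" entry in the raw data.
WHEEL_SHA256 = "08fe641c4511b1ad465a503c02b58752cf785ee00f4516a3b63e99103d1ea6ac"

# Example (assumes the wheel was downloaded to the current directory):
# matches(Path("llm_inference-0.0.6-py3-none-any.whl"), WHEEL_SHA256)
```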
        