llama-inference


| Field | Value |
| --- | --- |
| Name | llama-inference |
| Version | 0.0.4 |
| Home page | https://github.com/aniketmaurya/LLaMA-inference-api |
| Summary | Large Language Models Inference API and Chatbot |
| Upload time | 2023-07-04 22:02:47 |
| Author | Aniket Maurya |
| Requires Python | >=3.8 |
| License | Apache License 2.0 |
| Keywords | llm, llama, gpt |
| Requirements | No requirements were recorded. |

# Large Language Model (LLM) Inference API and Chatbot 🦙

![project banner](https://github.com/aniketmaurya/LLaMA-Inference-API/raw/main/assets/llama-inference-api-min.png)

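An inference API and chatbot for large language models such as LLaMA and Falcon, built on Lit-GPT.

Install from PyPI:

```bash
pip install llama-inference

# to use the chatbot (quotes keep the shell from expanding the brackets)
pip install "llama-inference[chatbot]"
```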

### Install from main branch
```bash
pip install git+https://github.com/aniketmaurya/llama-inference-api.git@main
```

> **Note**: You need to install [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) manually and set up the model weights before using this project.

```bash
pip install lit-gpt@git+https://github.com/Lightning-AI/lit-gpt.git@main
```


## For Inference

```python
from llm_inference import LLMInference

# Directory that holds the Lit-GPT checkpoint for the model
checkpoint_dir = "checkpoints/tiiuae/falcon-7b"

# "bf16-true" runs the model in bfloat16
model = LLMInference(checkpoint_dir=checkpoint_dir, precision="bf16-true")

# Generate a completion for the prompt
print(model("New York is located in"))
```
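
Loading fails late if the checkpoint directory is incomplete, so it can help to check it first. The sketch below is not part of `llama-inference`; `missing_checkpoint_files` is a hypothetical helper, and the file names `lit_model.pth` and `lit_config.json` are an assumption based on Lit-GPT's converted-checkpoint layout, so adjust them to match what your Lit-GPT version writes.

```python
from pathlib import Path

# Assumption: file names produced by Lit-GPT's checkpoint conversion.
# They are not confirmed by this project's documentation.
EXPECTED_FILES = ("lit_model.pth", "lit_config.json")


def missing_checkpoint_files(checkpoint_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoint_files("checkpoints/tiiuae/falcon-7b")
    if missing:
        print("Checkpoint directory is missing:", ", ".join(missing))
    else:
        print("Checkpoint directory looks complete.")
```

An empty list means every expected file was found and the directory can be passed as `checkpoint_dir`.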


## For deploying as a REST API

Create a Python file `app.py` and initialize the `ServeLLaMA` App.

```python
# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest

import lightning as L

component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
```

Then start the app with the Lightning CLI:

```bash
lightning run app app.py
```
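
Once the app is running, a client can send prompts over HTTP. The following is a minimal sketch using only the standard library. The `/predict` route and the `prompt` field name are assumptions (they are not stated in this README), and `base_url` must be the address that the running app reports, so verify all three against your deployment.

```python
import json
import urllib.request


def build_predict_request(base_url, prompt, route="/predict"):
    """Build a JSON POST request carrying the prompt.

    Assumptions: the server accepts POST on `route` and reads a
    `prompt` field from the JSON body.
    """
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + route,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query(base_url, prompt, timeout=60):
    """Send the prompt to the running app and return the decoded JSON reply."""
    request = build_predict_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `query("http://127.0.0.1:<port>", "New York is located in")` would return the parsed response body, where `<port>` is whichever port the app was started on.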

## How to use the Chatbot

```python
from chatbot import LLaMAChatBot

# Path to the directory that holds the model weights
checkpoint_dir = "../../weights"

bot = LLaMAChatBot(checkpoint_dir=checkpoint_dir)

# Send a message and print the bot's reply
print(bot.send("hi, what is the capital of France?"))
```
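
To send several messages in a row, the `send` call can be wrapped in a small loop. This is a sketch only: `run_conversation` is a hypothetical helper, not part of the package, and it accepts any callable that maps a message to a reply so that it can be tried without model weights. With a real bot, pass `bot.send`.

```python
from typing import Callable, Iterable, List, Tuple


def run_conversation(
    send: Callable[[str], str], messages: Iterable[str]
) -> List[Tuple[str, str]]:
    """Send each message in order and collect (message, reply) pairs."""
    transcript = []
    for message in messages:
        reply = send(message)
        transcript.append((message, reply))
    return transcript


if __name__ == "__main__":
    # Stand-in for bot.send so the sketch runs without model weights.
    def echo_send(message: str) -> str:
        return f"(reply to: {message})"

    questions = ["hi", "what is the capital of France?"]
    for message, reply in run_conversation(echo_send, questions):
        print(f"user: {message}")
        print(f"bot:  {reply}")
```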

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/aniketmaurya/LLaMA-inference-api",
    "name": "llama-inference",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "LLM,LLaMA,GPT",
    "author": "Aniket Maurya",
    "author_email": "theaniketmaurya@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/c7/57/6edd35ca1d9e0bc7216daa1ba4f215342316792dc721f17a5c815a8e6c63/llama_inference-0.0.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Large Language Models Inference API and Chatbot",
    "version": "0.0.4",
    "project_urls": {
        "Homepage": "https://github.com/aniketmaurya/LLaMA-inference-api"
    },
    "split_keywords": [
        "llm",
        "llama",
        "gpt"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1904d25914eeef300b51bc1d19ec0ada1d734efeb8853e235cd204d3e75a3574",
                "md5": "403342aae2636901878ba1e70bba3978",
                "sha256": "e05169a78b4b4e859517596c0ae7888e18d86320289a8c5c2775c5a8369f47ec"
            },
            "downloads": -1,
            "filename": "llama_inference-0.0.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "403342aae2636901878ba1e70bba3978",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 10423,
            "upload_time": "2023-07-04T22:02:45",
            "upload_time_iso_8601": "2023-07-04T22:02:45.957619Z",
            "url": "https://files.pythonhosted.org/packages/19/04/d25914eeef300b51bc1d19ec0ada1d734efeb8853e235cd204d3e75a3574/llama_inference-0.0.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c7576edd35ca1d9e0bc7216daa1ba4f215342316792dc721f17a5c815a8e6c63",
                "md5": "cf3dbac5cce5c06ac7c7c7454e2f851b",
                "sha256": "222db34bb8ac0180bd049b5e43640f64f6dcb2c654fa86aa82fce6850999b268"
            },
            "downloads": -1,
            "filename": "llama_inference-0.0.4.tar.gz",
            "has_sig": false,
            "md5_digest": "cf3dbac5cce5c06ac7c7c7454e2f851b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 414621,
            "upload_time": "2023-07-04T22:02:47",
            "upload_time_iso_8601": "2023-07-04T22:02:47.717947Z",
            "url": "https://files.pythonhosted.org/packages/c7/57/6edd35ca1d9e0bc7216daa1ba4f215342316792dc721f17a5c815a8e6c63/llama_inference-0.0.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-04 22:02:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aniketmaurya",
    "github_project": "LLaMA-inference-api",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "llama-inference"
}
        