# Large Language Model (LLM) Inference API and Chatbot 🦙
![project banner](https://github.com/aniketmaurya/LLaMA-Inference-API/raw/main/assets/llama-inference-api-min.png)
Inference API and chatbot for LLaMA and other LLMs supported by [Lit-GPT](https://github.com/Lightning-AI/lit-gpt).
```bash
pip install llama-inference

# to use the chatbot
pip install "llama-inference[chatbot]"
```
## Install from main branch
```bash
pip install git+https://github.com/aniketmaurya/llama-inference-api.git@main
```
> **Note**: You need to manually install [Lit-GPT](https://github.com/Lightning-AI/lit-gpt) and set up the model weights before using this project.
```bash
pip install lit-gpt@git+https://github.com/Lightning-AI/lit-gpt.git@main
```
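To set up the weights, Lit-GPT provides download and conversion scripts. A minimal sketch for Falcon-7B, assuming the script names and flags from the Lit-GPT README at the time of writing:

```bash
# Download the Hugging Face checkpoint, then convert it to the Lit-GPT format.
# Script names and flags follow the Lit-GPT repo and may have changed since.
python scripts/download.py --repo_id tiiuae/falcon-7b
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/tiiuae/falcon-7b
```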
## For Inference
```python
from llm_inference import LLMInference
import os

# Assumes $WEIGHTS points at the directory that contains the
# converted Lit-GPT checkpoints (see the weight setup above).
WEIGHTS_PATH = os.environ["WEIGHTS"]
checkpoint_dir = os.path.join(WEIGHTS_PATH, "checkpoints/tiiuae/falcon-7b")

model = LLMInference(checkpoint_dir=checkpoint_dir, precision="bf16-true")

print(model("New York is located in"))
```
## For deploying as a REST API
Create a Python file `app.py` and initialize the `ServeLLaMA` app:
```python
# app.py
from llm_inference.serve import ServeLLaMA, Response, PromptRequest
import lightning as L

# Wrap the model as a Lightning serving component and build the app
component = ServeLLaMA(input_type=PromptRequest, output_type=Response)
app = L.LightningApp(component)
```
```bash
lightning run app app.py
```
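Once the app is running, you can query it over HTTP. A minimal client sketch, assuming the component serves a POST `/predict` route on `localhost:7501` and that `PromptRequest` carries a single `prompt` field (the actual route and port are not confirmed by this README; check the app logs):

```python
# client.py -- hedged sketch of calling the deployed API.
# The port, route, and payload field below are assumptions;
# check the Lightning app logs for the actual endpoint.
import requests

response = requests.post(
    "http://127.0.0.1:7501/predict",
    json={"prompt": "New York is located in"},
)
print(response.json())
```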
## How to use the Chatbot
```python
from chatbot import LLaMAChatBot

checkpoint_dir = "../../weights"

bot = LLaMAChatBot(checkpoint_dir=checkpoint_dir)

print(bot.send("hi, what is the capital of France?"))
```
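Follow-up messages go through the same `send` method; whether earlier turns are kept in context depends on the `LLaMAChatBot` implementation:

```python
# Follow-up turn (assumes the bot tracks conversation history internally)
print(bot.send("and what is its population?"))
```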