# LLaMA Server
[![PyPI version](https://img.shields.io/pypi/v/llama-server)](https://pypi.org/project/llama-server/)[![Unit test](https://github.com/nuance1979/llama-server/actions/workflows/test.yml/badge.svg?branch=main&event=push)](https://github.com/nuance1979/llama-server/actions)[![GitHub stars](https://img.shields.io/github/stars/nuance1979/llama-server)](https://star-history.com/#nuance1979/llama-server&Date)[![GitHub license](https://img.shields.io/github/license/nuance1979/llama-server)](https://github.com/nuance1979/llama-server/blob/master/LICENSE)
LLaMA Server combines the power of [LLaMA C++](https://github.com/ggerganov/llama.cpp) (via [PyLLaMACpp](https://github.com/nomic-ai/pyllamacpp)) with the beauty of [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui).
🦙LLaMA C++ (via 🐍PyLLaMACpp) ➕ 🤖Chatbot UI ➕ 🔗LLaMA Server 🟰 😊
**UPDATE**: Now supports better streaming through [PyLLaMACpp](https://github.com/nomic-ai/pyllamacpp)!
**UPDATE**: Now supports streaming!
## Demo
- Streaming
https://user-images.githubusercontent.com/10931178/229980159-61546fa6-2985-4cdc-8230-5dcb6a69c559.mov
- Non-streaming
https://user-images.githubusercontent.com/10931178/229408428-5b6ef72d-28d0-427f-ae83-e23972e2dcff.mov
## Setup
- Get your favorite LLaMA models:
  - Download them from [🤗Hugging Face](https://huggingface.co/models?sort=downloads&search=ggml);
  - Or follow the instructions at [LLaMA C++](https://github.com/ggerganov/llama.cpp);
  - Make sure the models are converted and quantized;
- Create a `models.yml` file to provide your `model_home` directory and add your favorite [South American camelids](https://en.wikipedia.org/wiki/Lama_(genus)), e.g.:
```yaml
model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path
```
See [models.yml](https://github.com/nuance1979/llama-server/blob/main/models.yml) for an example.
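For illustration, here is a minimal Python sketch of how such a config could be loaded and a model path resolved. The `load_model_path` helper is hypothetical, not part of llama-server's API:
```python
# Hypothetical helper illustrating the models.yml structure above;
# not part of llama-server's actual code.
from pathlib import Path

import yaml  # PyYAML


def load_model_path(models_yml: str, model_id: str) -> Path:
    """Resolve a model path from a models.yml-style config."""
    with open(models_yml) as f:
        config = yaml.safe_load(f)
    model_home = Path(config["model_home"])
    entry = config["models"][model_id]
    path = Path(entry["path"])
    # A path may be relative to `model_home` or absolute.
    resolved = path if path.is_absolute() else model_home / path
    if not resolved.exists():
        raise FileNotFoundError(f"{entry['name']}: no model at {resolved}")
    return resolved


print(load_model_path("models.yml", "llama-7b"))
```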
- Set up a Python environment:
```bash
conda create -n llama python=3.9
conda activate llama
```
- Install LLaMA Server:
  - From PyPI:
    ```bash
    python -m pip install llama-server
    ```
  - Or from source:
    ```bash
    python -m pip install git+https://github.com/nuance1979/llama-server.git
    ```
- Install a patched version of PyLLaMACpp: (*Note:* this step will no longer be needed once PyLLaMACpp makes a new release.)
```bash
python -m pip install git+https://github.com/nuance1979/pyllamacpp.git@dev --upgrade
```
- Start LLaMA Server with your `models.yml` file:
```bash
llama-server --models-yml models.yml --model-id llama-7b
```
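Once the server is up, you can also talk to it without Chatbot UI. The sketch below assumes the server exposes an OpenAI-style `/v1/chat/completions` route on port 8000; both the port and the route are assumptions here, so check the server's startup output for the actual address:
```python
# Query the server directly; the address, route, and response shape
# are assumed to mirror OpenAI's chat API, since that is what
# Chatbot UI speaks.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed address/route
    json={
        "model": "llama-7b",
        "messages": [{"role": "user", "content": "Hello, llama!"}],
        "stream": False,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```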
- Check out [my fork](https://github.com/nuance1979/chatbot-ui) of Chatbot UI and start the app:
```bash
git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev
```
- Open http://localhost:3000 in your browser;
  - Click "OpenAI API Key" at the bottom left corner and enter your [OpenAI API Key](https://platform.openai.com/account/api-keys);
  - Or follow the instructions at [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui) to put your key into a `.env.local` file and restart:
    ```bash
    cp .env.local.example .env.local
    <edit .env.local to add your OPENAI_API_KEY>
    ```
- Enjoy!
## More
- Try a larger model if you have it:
```bash
llama-server --models-yml models.yml --model-id llama-13b # or any `model_id` defined in `models.yml`
```
- Try non-streaming mode by restarting Chatbot UI:
```bash
export LLAMA_STREAM_MODE=0 # set to 1 to re-enable streaming
npm run dev
```
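If you want to consume the streamed reply from your own code instead of Chatbot UI, here is a sketch that assumes the server follows OpenAI's server-sent-events format (`data: {...}` lines, terminated by `data: [DONE]`); as above, the address and route are assumptions:
```python
# Read a streamed chat completion chunk by chunk; the format is assumed
# to match OpenAI's SSE streaming convention.
import json

import requests

with requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed address/route
    json={
        "model": "llama-7b",
        "messages": [{"role": "user", "content": "Tell me about llamas."}],
        "stream": True,
    },
    stream=True,
    timeout=120,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        print(chunk["choices"][0]["delta"].get("content", ""), end="", flush=True)
```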
## Limitations
- "Regenerate response" is currently not working;
- IMHO, the prompt/reverse-prompt mechanism of LLaMA C++'s interactive mode needs an overhaul. I tried very hard to dance around it, but the whole thing is still a hack.
## Fun facts
I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with [ChatGPT](https://chat.openai.com); no more Stack Overflow.