| Field | Value |
|---|---|
| Name | vllm-haystack |
| Version | 0.1.2 |
| home_page | |
| Summary | A simple adapter to use vLLM in your Haystack pipelines. |
| upload_time | 2023-12-04 15:23:52 |
| maintainer | |
| docs_url | None |
| author | |
| requires_python | >=3.7 |
| license | |
| keywords | ai, haystack, llm, nlp |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# vLLM-haystack-adapter
[![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)
Use [vLLM](https://github.com/vllm-project/vllm) in your Haystack pipelines to run fast, self-hosted LLMs.
<p align="center">
<img alt="vLLM" src="https://raw.githubusercontent.com/vllm-project/vllm/main/docs/source/assets/logos/vllm-logo-text-light.png" width="45%" style="vertical-align: middle;">
<a href="https://www.deepset.ai/haystack/">
<img src="https://raw.githubusercontent.com/deepset-ai/haystack/main/docs/img/haystack_logo_colored.png" alt="Haystack" width="45%" style="vertical-align: middle;">
</a>
</p>
## Installation
Install the wrapper via pip: `pip install vllm-haystack`
## Usage
This integration provides two invocation layers:
- `vLLMInvocationLayer`: To use models hosted on a vLLM server (or any other OpenAI-compatible server)
- `vLLMLocalInvocationLayer`: To use locally hosted vLLM models
### Use a Model Hosted on a vLLM Server
To use a model hosted on a vLLM server, use the `vLLMInvocationLayer`.
Here is a simple example of how a `PromptNode` can be created with the wrapper.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMInvocationLayer

API = "http://localhost:8000/v1"  # Replace this with the base URL of your vLLM server

model = PromptModel(model_name_or_path="", invocation_layer_class=vLLMInvocationLayer, max_length=256, api_key="EMPTY", model_kwargs={
    "api_base": API,
    "maximum_context_length": 2048,
})

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
The model name is inferred from the model served on the vLLM server, so `model_name_or_path` can stay empty.
For more configuration examples, take a look at the unit tests.
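Once the `PromptNode` is created, you can query it directly. A minimal sketch, assuming Haystack 1.x's callable `PromptNode` interface (the prompt text is only an example):

```python
# Calling the node returns a list of generated strings.
results = prompt_node("Briefly explain what vLLM is.")
print(results[0])
```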
#### Hosting a vLLM Server
To create an *OpenAI-Compatible Server* via vLLM, you can follow the steps in the
Quickstart section of their [documentation](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server).
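Before pointing the invocation layer at the server, it can help to check that the endpoint is reachable. The sketch below assumes the server listens on `http://localhost:8000` (adjust to your deployment) and uses the `/v1/models` route exposed by the OpenAI-compatible server:

```python
import requests

API = "http://localhost:8000/v1"  # Assumed local server address; replace with your own

# The OpenAI-compatible server lists the model(s) it serves under /v1/models.
response = requests.get(f"{API}/models")
response.raise_for_status()
print([entry["id"] for entry in response.json()["data"]])
```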
### Use a Model Hosted Locally
⚠️ To run vLLM locally, you need to have `vllm` installed and a supported GPU.

If you don't want to use an API server, this wrapper also provides a `vLLMLocalInvocationLayer`, which runs vLLM on the same node Haystack is running on.
Here is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMLocalInvocationLayer

MODEL = "mistralai/Mistral-7B-Instruct-v0.1"  # Example only; replace with the model you want vLLM to load

model = PromptModel(model_name_or_path=MODEL, invocation_layer_class=vLLMLocalInvocationLayer, max_length=256, model_kwargs={
    "maximum_context_length": 2048,
})

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
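Because the adapter produces a regular Haystack `PromptNode`, it can also be used inside a pipeline. A minimal sketch, assuming Haystack 1.x's `Pipeline` API and reusing the `prompt_node` defined above:

```python
from haystack import Pipeline

# Plug the prompt node into a standard Haystack pipeline.
pipeline = Pipeline()
pipeline.add_node(component=prompt_node, name="prompt_node", inputs=["Query"])

# Without a prompt template, the query text is sent to the model as-is.
output = pipeline.run(query="Summarize what vLLM does in one sentence.")
print(output["results"])
```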
## Raw data

```json
{
"_id": null,
"home_page": "",
"name": "vllm-haystack",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "AI,Haystack,LLM,NLP",
"author": "",
"author_email": "Lukas Kreussel <65088241+LLukas22@users.noreply.github.com>",
"download_url": "https://files.pythonhosted.org/packages/16/35/df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9/vllm_haystack-0.1.2.tar.gz",
"platform": null,
"description": "# vLLM-haystack-adapter\n[![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)\n\nSimply use [vLLM](https://github.com/vllm-project/vllm) in your haystack pipeline, to utilize fast, self-hosted LLMs. \n\n<p align=\"center\">\n <img alt=\"vLLM\" src=\"https://raw.githubusercontent.com/vllm-project/vllm/main/docs/source/assets/logos/vllm-logo-text-light.png\" width=\"45%\" style=\"vertical-align: middle;\">\n <a href=\"https://www.deepset.ai/haystack/\">\n <img src=\"https://raw.githubusercontent.com/deepset-ai/haystack/main/docs/img/haystack_logo_colored.png\" alt=\"Haystack\" width=\"45%\" style=\"vertical-align: middle;\">\n </a>\n</p>\n\n## Installation\nInstall the wrapper via pip: `pip install vllm-haystack`\n\n## Usage\nThis integration provides two invocation layers:\n- `vLLMInvocationLayer`: To use models hosted on a vLLM server (or any other OpenAI compatible server)\n- `vLLMLocalInvocationLayer`: To use locally hosted vLLM models\n\n### Use a Model Hosted on a vLLM Server\nTo utilize the wrapper the `vLLMInvocationLayer` has to be used. \n\nHere is a simple example of how a `PromptNode` can be created with the wrapper.\n```python\nfrom haystack.nodes import PromptNode, PromptModel\nfrom vllm_haystack import vLLMInvocationLayer\n\n\nmodel = PromptModel(model_name_or_path=\"\", invocation_layer_class=vLLMInvocationLayer, max_length=256, api_key=\"EMPTY\", model_kwargs={\n \"api_base\" : API, # Replace this with your API-URL\n \"maximum_context_length\": 2048,\n })\n\nprompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)\n```\nThe model will be inferred based on the model served on the vLLM server.\nFor more configuration examples, take a look at the unit-tests.\n\n#### Hosting a vLLM Server\n\nTo create an *OpenAI-Compatible Server* via vLLM you can follow the steps in the \nQuickstart section of their [documentation](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server).\n\n### Use a Model Hosted Locally\n\u26a0\ufe0fTo run `vLLM` locally you need to have `vllm` installed and a supported GPU.\n\nIf you don't want to use an API-Server this wrapper also provides a `vLLMLocalInvocationLayer` which executes the vLLM on the same node Haystack is running on. \n\nHere is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`.\n```python\nfrom haystack.nodes import PromptNode, PromptModel\nfrom vllm_haystack import vLLMLocalInvocationLayer\n\nmodel = PromptModel(model_name_or_path=MODEL, invocation_layer_class=vLLMLocalInvocationLayer, max_length=256, model_kwargs={\n \"maximum_context_length\": 2048,\n })\n\nprompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)\n```\n\n\n",
"bugtrack_url": null,
"license": "",
"summary": "A simple adapter to use vLLM in your Haystack pipelines.",
"version": "0.1.2",
"project_urls": {
"Documentation": "https://github.com/LLukas22/vLLM-haystack-adapter#readme",
"Issues": "https://github.com/LLukas22/vLLM-haystack-adapter/issues",
"Source": "https://github.com/LLukas22/vLLM-haystack-adapter"
},
"split_keywords": [
"ai",
"haystack",
"llm",
"nlp"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "4a24b561ec26e5b4b7aeb54c851157f624b2aa4c739948f6d41a9c2734bdfc96",
"md5": "d34f78e43441b7647c5b0c8da81dde66",
"sha256": "93993735949ab201a70802e31d9d2aad6a98c2c3618bca990e00d829c0868950"
},
"downloads": -1,
"filename": "vllm_haystack-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d34f78e43441b7647c5b0c8da81dde66",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 7854,
"upload_time": "2023-12-04T15:23:51",
"upload_time_iso_8601": "2023-12-04T15:23:51.313138Z",
"url": "https://files.pythonhosted.org/packages/4a/24/b561ec26e5b4b7aeb54c851157f624b2aa4c739948f6d41a9c2734bdfc96/vllm_haystack-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1635df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9",
"md5": "86357bb0ef46c7699ea85df9d106332f",
"sha256": "530015c6518a60114ce02a534addd09dfa035f69dc88d63e1fafd7c43364a692"
},
"downloads": -1,
"filename": "vllm_haystack-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "86357bb0ef46c7699ea85df9d106332f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 10462,
"upload_time": "2023-12-04T15:23:52",
"upload_time_iso_8601": "2023-12-04T15:23:52.489455Z",
"url": "https://files.pythonhosted.org/packages/16/35/df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9/vllm_haystack-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-04 15:23:52",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "LLukas22",
"github_project": "vLLM-haystack-adapter#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "vllm-haystack"
}
```