vllm-haystack


- Name: vllm-haystack
- Version: 0.1.2
- Summary: A simple adapter to use vLLM in your Haystack pipelines.
- Upload time: 2023-12-04 15:23:52
- Requires Python: >=3.7
- Keywords: ai, haystack, llm, nlp
- Requirements: No requirements were recorded.
- Travis CI: No Travis.
- Coveralls test coverage: No coveralls.
# vLLM-haystack-adapter
[![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)

Use [vLLM](https://github.com/vllm-project/vllm) in your Haystack pipelines to run fast, self-hosted LLMs.

<p align="center">
    <img alt="vLLM" src="https://raw.githubusercontent.com/vllm-project/vllm/main/docs/source/assets/logos/vllm-logo-text-light.png" width="45%" style="vertical-align: middle;">
    <a href="https://www.deepset.ai/haystack/">
        <img src="https://raw.githubusercontent.com/deepset-ai/haystack/main/docs/img/haystack_logo_colored.png" alt="Haystack" width="45%" style="vertical-align: middle;">
    </a>
</p>

## Installation
Install the wrapper via pip: `pip install vllm-haystack`

## Usage
This integration provides two invocation layers:
- `vLLMInvocationLayer`: for models hosted on a vLLM server (or any other OpenAI-compatible server)
- `vLLMLocalInvocationLayer`: for locally hosted vLLM models

### Use a Model Hosted on a vLLM Server
To query a model served by a vLLM server, use the `vLLMInvocationLayer`.

Here is a simple example of how a `PromptNode` can be created with the wrapper.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMInvocationLayer

API = "<your-api-url>"  # Placeholder: replace this with your API URL

model = PromptModel(
    model_name_or_path="",  # Left empty: the model is inferred from the server
    invocation_layer_class=vLLMInvocationLayer,
    max_length=256,
    api_key="EMPTY",
    model_kwargs={
        "api_base": API,
        "maximum_context_length": 2048,
    },
)

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
The model name is inferred from the model served by the vLLM server, so `model_name_or_path` can be left empty.
For more configuration examples, see the unit tests.
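
To check which model a server exposes before wiring it into a pipeline, you can query its model list yourself. The following is a minimal sketch, not the adapter's own implementation: it assumes the server follows the OpenAI convention of answering `GET <api_base>/models` with a JSON body containing a `data` list of objects with an `id` field, and that `api_base` already includes any version prefix. The helper names `served_model_ids` and `fetch_served_model_ids` and the sample model id are hypothetical.

```python
import json
from typing import List
from urllib.request import urlopen


def served_model_ids(payload: dict) -> List[str]:
    """Extract the model ids from an OpenAI-style `GET /models` response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_served_model_ids(api_base: str, timeout: float = 10.0) -> List[str]:
    """Query `<api_base>/models` and return the ids of the served models."""
    with urlopen(f"{api_base.rstrip('/')}/models", timeout=timeout) as response:
        return served_model_ids(json.load(response))


# Example response shape; the model id is made up for illustration.
sample = {"object": "list", "data": [{"id": "example-org/example-model", "object": "model"}]}
print(served_model_ids(sample))
```

Against a running server, `fetch_served_model_ids(API)` would return the same kind of list, with `API` being the URL passed as `api_base` above.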

#### Hosting a vLLM Server

To start an *OpenAI-compatible server* with vLLM, follow the steps in the
Quickstart section of the vLLM [documentation](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server).

### Use a Model Hosted Locally
⚠️ To run vLLM locally, you need `vllm` installed and a supported GPU.

If you don't want to use an API server, the wrapper also provides a `vLLMLocalInvocationLayer`, which runs vLLM on the same node that Haystack is running on.

Here is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`.
```python
from haystack.nodes import PromptNode, PromptModel
from vllm_haystack import vLLMLocalInvocationLayer

MODEL = "<model-name-or-path>"  # Placeholder: replace this with the model to load

model = PromptModel(
    model_name_or_path=MODEL,
    invocation_layer_class=vLLMLocalInvocationLayer,
    max_length=256,
    model_kwargs={
        "maximum_context_length": 2048,
    },
)

prompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)
```
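
A `PromptNode` created with either invocation layer can be queried like any other Haystack 1.x `PromptNode`, which is callable with a prompt string and returns a list of results. The sketch below wraps that call in a small helper; the name `ask` is hypothetical and not part of this package, and it only assumes that the node is a callable returning a list.

```python
from typing import Any, Callable, List


def ask(prompt_node: Callable[[str], List[Any]], question: str) -> List[str]:
    """Send one prompt through a PromptNode-like callable and return cleaned answers."""
    # Calling the node runs the prompt; each result is converted to a stripped string.
    return [str(answer).strip() for answer in prompt_node(question)]
```

With the `prompt_node` from the examples above, `ask(prompt_node, "What is the capital of Germany?")` would return the generated answers as a list of strings.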



            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "vllm-haystack",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "AI,Haystack,LLM,NLP",
    "author": "",
    "author_email": "Lukas Kreussel <65088241+LLukas22@users.noreply.github.com>",
    "download_url": "https://files.pythonhosted.org/packages/16/35/df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9/vllm_haystack-0.1.2.tar.gz",
    "platform": null,
    "description": "# vLLM-haystack-adapter\n[![PyPI - Version](https://img.shields.io/pypi/v/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/vllm-haystack.svg)](https://pypi.org/project/vllm-haystack)\n\nSimply use [vLLM](https://github.com/vllm-project/vllm) in your haystack pipeline, to utilize fast, self-hosted LLMs. \n\n<p align=\"center\">\n    <img alt=\"vLLM\" src=\"https://raw.githubusercontent.com/vllm-project/vllm/main/docs/source/assets/logos/vllm-logo-text-light.png\" width=\"45%\" style=\"vertical-align: middle;\">\n    <a href=\"https://www.deepset.ai/haystack/\">\n        <img src=\"https://raw.githubusercontent.com/deepset-ai/haystack/main/docs/img/haystack_logo_colored.png\" alt=\"Haystack\" width=\"45%\" style=\"vertical-align: middle;\">\n    </a>\n</p>\n\n## Installation\nInstall the wrapper via pip:  `pip install vllm-haystack`\n\n## Usage\nThis integration provides two invocation layers:\n- `vLLMInvocationLayer`: To use models hosted on a vLLM server (or any other OpenAI compatible server)\n- `vLLMLocalInvocationLayer`: To use locally hosted vLLM models\n\n### Use a Model Hosted on a vLLM Server\nTo utilize the wrapper the `vLLMInvocationLayer` has to be used. 
\n\nHere is a simple example of how a `PromptNode` can be created with the wrapper.\n```python\nfrom haystack.nodes import PromptNode, PromptModel\nfrom vllm_haystack import vLLMInvocationLayer\n\n\nmodel = PromptModel(model_name_or_path=\"\", invocation_layer_class=vLLMInvocationLayer, max_length=256, api_key=\"EMPTY\", model_kwargs={\n        \"api_base\" : API, # Replace this with your API-URL\n        \"maximum_context_length\": 2048,\n    })\n\nprompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)\n```\nThe model will be inferred based on the model served on the vLLM server.\nFor more configuration examples, take a look at the unit-tests.\n\n#### Hosting a vLLM Server\n\nTo create an *OpenAI-Compatible Server* via vLLM you can follow the steps in the \nQuickstart section of their [documentation](https://vllm.readthedocs.io/en/latest/getting_started/quickstart.html#openai-compatible-server).\n\n### Use a Model Hosted Locally\n\u26a0\ufe0fTo run `vLLM` locally you need to have `vllm` installed and a supported GPU.\n\nIf you don't want to use an API-Server this wrapper also provides a `vLLMLocalInvocationLayer` which executes the vLLM on the same node Haystack is running on. \n\nHere is a simple example of how a `PromptNode` can be created with the `vLLMLocalInvocationLayer`.\n```python\nfrom haystack.nodes import PromptNode, PromptModel\nfrom vllm_haystack import vLLMLocalInvocationLayer\n\nmodel = PromptModel(model_name_or_path=MODEL, invocation_layer_class=vLLMLocalInvocationLayer, max_length=256, model_kwargs={\n        \"maximum_context_length\": 2048,\n    })\n\nprompt_node = PromptNode(model_name_or_path=model, top_k=1, max_length=256)\n```\n\n\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "A simple adapter to use vLLM in your Haystack pipelines.",
    "version": "0.1.2",
    "project_urls": {
        "Documentation": "https://github.com/LLukas22/vLLM-haystack-adapter#readme",
        "Issues": "https://github.com/LLukas22/vLLM-haystack-adapter/issues",
        "Source": "https://github.com/LLukas22/vLLM-haystack-adapter"
    },
    "split_keywords": [
        "ai",
        "haystack",
        "llm",
        "nlp"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4a24b561ec26e5b4b7aeb54c851157f624b2aa4c739948f6d41a9c2734bdfc96",
                "md5": "d34f78e43441b7647c5b0c8da81dde66",
                "sha256": "93993735949ab201a70802e31d9d2aad6a98c2c3618bca990e00d829c0868950"
            },
            "downloads": -1,
            "filename": "vllm_haystack-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "d34f78e43441b7647c5b0c8da81dde66",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 7854,
            "upload_time": "2023-12-04T15:23:51",
            "upload_time_iso_8601": "2023-12-04T15:23:51.313138Z",
            "url": "https://files.pythonhosted.org/packages/4a/24/b561ec26e5b4b7aeb54c851157f624b2aa4c739948f6d41a9c2734bdfc96/vllm_haystack-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1635df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9",
                "md5": "86357bb0ef46c7699ea85df9d106332f",
                "sha256": "530015c6518a60114ce02a534addd09dfa035f69dc88d63e1fafd7c43364a692"
            },
            "downloads": -1,
            "filename": "vllm_haystack-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "86357bb0ef46c7699ea85df9d106332f",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 10462,
            "upload_time": "2023-12-04T15:23:52",
            "upload_time_iso_8601": "2023-12-04T15:23:52.489455Z",
            "url": "https://files.pythonhosted.org/packages/16/35/df875e43b92273f0b9157f029bf6747f674d51da39bb64e65af7f392a7b9/vllm_haystack-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-04 15:23:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "LLukas22",
    "github_project": "vLLM-haystack-adapter#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "vllm-haystack"
}
        