llama-index-llms-optimum-intel


Name: llama-index-llms-optimum-intel
Version: 0.2.1
Summary: llama-index llms optimum intel integration
Upload time: 2024-10-08 22:35:42
Author: Your Name
Requires Python: <4.0,>=3.9
License: MIT
Requirements: No requirements were recorded.
# LlamaIndex Llms Integration: Optimum Intel IPEX backend

## Installation

To install the required packages, run:

```bash
pip install llama-index-llms-optimum-intel
pip install llama-index
```

## Setup

### Define Functions for Prompt Handling

You will need functions to convert messages and completions into prompts:

```python
from llama_index.llms.optimum_intel import OptimumIntelLLM


def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # Ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # Add final assistant prompt
    prompt = prompt + "<|assistant|>\n"

    return prompt


def completion_to_prompt(completion):
    return f"<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"
```
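
As a quick sanity check, you can print the prompt these helpers produce for a short message list. This is purely illustrative (not part of the original example); the tags follow the Zephyr-style template used above:

```python
from llama_index.core.llms import ChatMessage

# Illustrative check of the template produced by messages_to_prompt.
sample_messages = [ChatMessage(role="user", content="Hello!")]
print(messages_to_prompt(sample_messages))
# Expected output:
# <|system|>
# </s>
# <|user|>
# Hello!</s>
# <|assistant|>
```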

### Model Loading

Load a model by passing its parameters to the `OptimumIntelLLM` constructor:

```python
oi_llm = OptimumIntelLLM(
    model_name="Intel/neural-chat-7b-v3-3",
    tokenizer_name="Intel/neural-chat-7b-v3-3",
    context_window=3900,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    device_map="cpu",
)

response = oi_llm.complete("What is the meaning of life?")
print(str(response))
```
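
If you want this LLM used throughout a LlamaIndex pipeline (query engines, chat engines, etc.), one option is to register it as the global default. A minimal sketch, assuming `llama-index-core` is installed:

```python
from llama_index.core import Settings

# Register the Optimum Intel LLM as the default LLM for LlamaIndex components.
Settings.llm = oi_llm
```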

### Streaming Responses

For streaming output, use the `stream_complete` and `stream_chat` methods:

#### Using `stream_complete`

```python
response = oi_llm.stream_complete("Who is Mother Teresa?")
for r in response:
    print(r.delta, end="")
```
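
Each streamed chunk carries the newly generated text in `delta`. If you also need the full completion at the end, you can accumulate the deltas yourself (a small sketch, not part of the original example):

```python
# Accumulate streamed deltas into the final completion string.
full_text = ""
for r in oi_llm.stream_complete("Who is Mother Teresa?"):
    full_text += r.delta or ""
print(full_text)
```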

#### Using `stream_chat`

```python
from llama_index.core.llms import ChatMessage

messages = [
    ChatMessage(
        role="system",
        content="You are an American chef in a small restaurant in New Orleans",
    ),
    ChatMessage(role="user", content="What is your dish of the day?"),
]

resp = oi_llm.stream_chat(messages)

for r in resp:
    print(r.delta, end="")
```
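
For completeness, the non-streaming counterpart is `chat`, which returns the full reply in one object (a minimal sketch reusing the same `messages` list):

```python
# Non-streaming chat: returns a ChatResponse with the complete reply.
resp = oi_llm.chat(messages)
print(resp.message.content)
```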

### LLM Implementation Example

https://docs.llamaindex.ai/en/stable/examples/llm/optimum_intel/

            
