| Field | Value |
| --- | --- |
| Name | llama-index-llms-optimum-intel |
| Version | 0.2.1 |
| Summary | llama-index llms optimum intel integration |
| Upload time | 2024-10-08 22:35:42 |
| Author | Your Name |
| Requires Python | <4.0,>=3.9 |
| License | MIT |
| Requirements | None recorded |
# LlamaIndex LLMs Integration: Optimum Intel IPEX Backend
## Installation
To install the required packages, run:
```bash
pip install llama-index-llms-optimum-intel
pip install llama-index
```
(In a Jupyter notebook, prefix these commands with `%pip` instead.)
## Setup
### Define Functions for Prompt Handling
You will need functions to convert messages and completions into prompts:
```python
from llama_index.llms.optimum_intel import OptimumIntelLLM


def messages_to_prompt(messages):
    prompt = ""
    for message in messages:
        if message.role == "system":
            prompt += f"<|system|>\n{message.content}</s>\n"
        elif message.role == "user":
            prompt += f"<|user|>\n{message.content}</s>\n"
        elif message.role == "assistant":
            prompt += f"<|assistant|>\n{message.content}</s>\n"

    # Ensure we start with a system prompt, insert blank if needed
    if not prompt.startswith("<|system|>\n"):
        prompt = "<|system|>\n</s>\n" + prompt

    # Add final assistant prompt
    prompt = prompt + "<|assistant|>\n"

    return prompt


def completion_to_prompt(completion):
    return f"<|system|>\n</s>\n<|user|>\n{completion}</s>\n<|assistant|>\n"
```
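For illustration, here is the string the `messages_to_prompt` helper above produces for a one-message conversation (the sample message is made up for this sketch):

```python
from llama_index.core.llms import ChatMessage

# Hypothetical single-message chat, just to show the template output;
# a blank system block is prepended automatically by messages_to_prompt.
sample = [ChatMessage(role="user", content="Hello!")]
print(messages_to_prompt(sample))
# <|system|>
# </s>
# <|user|>
# Hello!</s>
# <|assistant|>
```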
### Model Loading
Models are loaded by instantiating the `OptimumIntelLLM` class with the desired parameters:
```python
oi_llm = OptimumIntelLLM(
    model_name="Intel/neural-chat-7b-v3-3",
    tokenizer_name="Intel/neural-chat-7b-v3-3",
    context_window=3900,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "top_k": 50, "top_p": 0.95},
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    device_map="cpu",
)

response = oi_llm.complete("What is the meaning of life?")
print(str(response))
```
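`OptimumIntelLLM` also supports the standard non-streaming LlamaIndex chat interface. A minimal sketch, assuming the usual `chat` method of LlamaIndex LLMs; the persona and question below are placeholders:

```python
from llama_index.core.llms import ChatMessage

# Hypothetical conversation; `chat` returns a single ChatResponse
messages = [
    ChatMessage(role="system", content="You are a concise assistant."),
    ChatMessage(role="user", content="In one sentence, what is IPEX?"),
]
print(oi_llm.chat(messages))
```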
### Streaming Responses
To stream responses as they are generated, use the `stream_complete` and `stream_chat` methods:
#### Using `stream_complete`
```python
response = oi_llm.stream_complete("Who is Mother Teresa?")
for r in response:
    print(r.delta, end="")
```
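Each chunk's `delta` contains only the newly generated text, so if you also need the full completion afterward you can join the deltas yourself; a minimal sketch using the same prompt:

```python
# Accumulate the incremental deltas into the complete completion text
stream = oi_llm.stream_complete("Who is Mother Teresa?")
full_text = "".join(r.delta or "" for r in stream)
print(full_text)
```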
#### Using `stream_chat`
```python
from llama_index.core.llms import ChatMessage
messages = [
    ChatMessage(
        role="system",
        content="You are an American chef in a small restaurant in New Orleans",
    ),
    ChatMessage(role="user", content="What is your dish of the day?"),
]

resp = oi_llm.stream_chat(messages)

for r in resp:
    print(r.delta, end="")
```
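Beyond `delta`, each streamed `ChatResponse` in LlamaIndex also carries the text accumulated so far in its `message` field, so the final chunk holds the complete reply. A minimal sketch, assuming that standard accumulation behavior:

```python
final = None
for r in oi_llm.stream_chat(messages):
    print(r.delta, end="")
    final = r  # keep the last chunk

print()
if final is not None:
    # The last chunk's `message` contains the full accumulated reply
    print(final.message.content)
```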
### LLM Implementation Example
A complete notebook walkthrough is available in the LlamaIndex documentation: https://docs.llamaindex.ai/en/stable/examples/llm/optimum_intel/