| Field | Value |
| --- | --- |
| Name | llama-index-llms-huggingface |
| Version | 0.6.0 |
| home_page | None |
| Summary | llama-index llms huggingface integration |
| upload_time | 2025-07-30 23:11:41 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | <4.0,>=3.9 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# LlamaIndex Llms Integration: Huggingface
## Installation
1. Install the required Python packages:
```bash
# In a Jupyter notebook, prefix each command with % (for example, %pip install ...).
pip install llama-index-llms-huggingface
pip install llama-index-llms-huggingface-api
pip install "transformers[torch]" "huggingface_hub[inference]"
pip install llama-index
```
2. Set the Hugging Face API token as an environment variable:
```bash
export HUGGING_FACE_TOKEN=your_token_here
```
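
Once the variable is exported, Python code can read it back before any model is constructed. The helper below is a minimal sketch, not part of the package: the function name `get_hf_token` and the constant `TOKEN_ENV_VAR` are hypothetical, and only the variable name `HUGGING_FACE_TOKEN` comes from the step above. It treats an unset or blank variable as "no token".

```python
import os
from typing import Optional

# Variable name taken from the export step above.
TOKEN_ENV_VAR = "HUGGING_FACE_TOKEN"


def get_hf_token(env_var: str = TOKEN_ENV_VAR) -> Optional[str]:
    """Return the token from the environment, or None if unset or blank."""
    value = os.environ.get(env_var, "").strip()
    return value or None


if __name__ == "__main__":
    token = get_hf_token()
    print("token found" if token else f"{TOKEN_ENV_VAR} is not set")
```

Returning `None` for a blank value keeps an accidentally empty `export HUGGING_FACE_TOKEN=` from being passed along as if it were a real credential.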
## Usage
### Import Required Libraries
```python
import os
from typing import Optional
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI
```
### Run a Model Locally
To run the model locally on your machine:
```python
locally_run = HuggingFaceLLM(model_name="HuggingFaceH4/zephyr-7b-alpha")
```
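
Loading a 7B model locally is expensive, so it can help to wrap construction in a function and set a few limits explicitly. The following is a sketch under assumptions: `build_local_llm` is a hypothetical helper, the numeric values are illustrative rather than recommended, and the keyword arguments (`tokenizer_name`, `context_window`, `max_new_tokens`, `device_map`) should be checked against the installed version of `HuggingFaceLLM`. Using `device_map="auto"` is assumed to require the `accelerate` package.

```python
def build_local_llm(model_name: str = "HuggingFaceH4/zephyr-7b-alpha"):
    """Construct a local HuggingFaceLLM; model weights download on first use."""
    # Imported lazily so this module loads even if the package is not installed.
    from llama_index.llms.huggingface import HuggingFaceLLM

    return HuggingFaceLLM(
        model_name=model_name,
        tokenizer_name=model_name,  # assumed: the tokenizer matches the model
        context_window=2048,  # illustrative value, not a recommendation
        max_new_tokens=128,  # illustrative cap on generated tokens
        device_map="auto",  # assumed to need `accelerate` for device placement
    )
```

Calling `build_local_llm()` performs the actual download and load, so nothing heavy happens until the function is invoked.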
### Run a Model Remotely
To run the model remotely using Hugging Face's Inference API:
```python
HF_TOKEN: Optional[str] = os.getenv("HUGGING_FACE_TOKEN")
remotely_run = HuggingFaceInferenceAPI(
model_name="HuggingFaceH4/zephyr-7b-alpha", token=HF_TOKEN
)
```
### Anonymous Remote Execution
You can also use the Inference API anonymously without providing a token:
```python
remotely_run_anon = HuggingFaceInferenceAPI(
model_name="HuggingFaceH4/zephyr-7b-alpha"
)
```
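
The authenticated and anonymous calls differ only in whether `token` is passed, so one code path can cover both. This is a small sketch with a hypothetical helper name, `inference_api_kwargs`; it only builds a dictionary and does not contact the Inference API.

```python
from typing import Any, Dict, Optional


def inference_api_kwargs(
    model_name: str, token: Optional[str] = None
) -> Dict[str, Any]:
    """Build constructor arguments, leaving out `token` when none is available."""
    kwargs: Dict[str, Any] = {"model_name": model_name}
    if token:  # skips both None and an empty string
        kwargs["token"] = token
    return kwargs
```

With the package installed, the result can be unpacked into the constructor, for example `HuggingFaceInferenceAPI(**inference_api_kwargs("HuggingFaceH4/zephyr-7b-alpha", HF_TOKEN))`, which falls back to anonymous access when `HF_TOKEN` is `None`.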
### Use Recommended Model
If you do not provide a model name, Hugging Face's recommended model is used:
```python
remotely_run_recommended = HuggingFaceInferenceAPI(token=HF_TOKEN)
```
### Generate Text Completion
To generate a text completion using the remote model:
```python
completion_response = remotely_run_recommended.complete("To infinity, and")
print(completion_response)
```
### Set Global Tokenizer
If you change the LLM, set the global tokenizer to the one that matches the new model:
```python
from llama_index.core import set_global_tokenizer
from transformers import AutoTokenizer
set_global_tokenizer(
AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha").encode
)
```
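
The value passed to `set_global_tokenizer` above is a callable that maps a string to a sequence of tokens, and the length of that sequence is what token counting depends on. The sketch below illustrates the idea with a whitespace tokenizer as a stand-in; `count_tokens` and `whitespace_tokenize` are hypothetical names, and a real model tokenizer will generally produce different counts.

```python
from typing import Callable, List, Sequence


def count_tokens(tokenize: Callable[[str], Sequence], text: str) -> int:
    """Count tokens using any callable that maps a string to a token sequence."""
    return len(tokenize(text))


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer for illustration only; splits on whitespace."""
    return text.split()


if __name__ == "__main__":
    print(count_tokens(whitespace_tokenize, "To infinity, and beyond"))
```

With `transformers` installed, `AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha").encode` can be passed as `tokenize` in place of the stand-in.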
### LLM Implementation example
A complete example is available in the LlamaIndex documentation: https://docs.llamaindex.ai/en/stable/examples/llm/huggingface/
Raw data
{
"_id": null,
"home_page": null,
"name": "llama-index-llms-huggingface",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Your Name <you@example.com>",
"download_url": "https://files.pythonhosted.org/packages/82/1d/27888e4e8e29b904865e9c31d3dd215d4bad2bb893a8ad2d2037337134a5/llama_index_llms_huggingface-0.6.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "llama-index llms huggingface integration",
"version": "0.6.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "de506692b1f8320de49ca50027d321e6fedff101f3d39ef308a088079c6db107",
"md5": "7b44ca492c776fbfd7d2d46028e59d80",
"sha256": "77f866415c9ec42dac54e33408d0d329aefde077143dadd1888697aaf5b01b86"
},
"downloads": -1,
"filename": "llama_index_llms_huggingface-0.6.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7b44ca492c776fbfd7d2d46028e59d80",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 7760,
"upload_time": "2025-07-30T23:11:40",
"upload_time_iso_8601": "2025-07-30T23:11:40.067558Z",
"url": "https://files.pythonhosted.org/packages/de/50/6692b1f8320de49ca50027d321e6fedff101f3d39ef308a088079c6db107/llama_index_llms_huggingface-0.6.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "821d27888e4e8e29b904865e9c31d3dd215d4bad2bb893a8ad2d2037337134a5",
"md5": "a25ca041f4f78e7ff858519fa5cdd7cb",
"sha256": "e4190b23955bcf11791e24514b3d89d46bd24bc3a607641a2f26205d876df9b3"
},
"downloads": -1,
"filename": "llama_index_llms_huggingface-0.6.0.tar.gz",
"has_sig": false,
"md5_digest": "a25ca041f4f78e7ff858519fa5cdd7cb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 7762,
"upload_time": "2025-07-30T23:11:41",
"upload_time_iso_8601": "2025-07-30T23:11:41.041227Z",
"url": "https://files.pythonhosted.org/packages/82/1d/27888e4e8e29b904865e9c31d3dd215d4bad2bb893a8ad2d2037337134a5/llama_index_llms_huggingface-0.6.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-30 23:11:41",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "llama-index-llms-huggingface"
}