# LLMlight
[![Python](https://img.shields.io/pypi/pyversions/LLMlight)](https://pypi.org/project/LLMlight/)
[PyPI](https://pypi.org/project/LLMlight/) · [Documentation](https://erdogant.github.io/LLMlight/) · [GitHub](https://github.com/erdogant/LLMlight/) · [Downloads](https://pepy.tech/project/LLMlight) · [License](https://github.com/erdogant/LLMlight/blob/master/LICENSE) · [Forks](https://github.com/erdogant/LLMlight/network) · [Issues](https://github.com/erdogant/LLMlight/issues) · [Project Status: Active](http://www.repostatus.org/#active) · [Medium Blog](https://erdogant.github.io/LLMlight/pages/html/Documentation.html#medium-blog) · [Colab Notebook](https://erdogant.github.io/LLMlight/pages/html/Documentation.html#colab-notebook)
<div>
LLMlight is a Python package for running Large Language Models (LLMs) locally with minimal dependencies. It provides a simple interface to a variety of LLMs, with support for GGUF models and local API endpoints. ⭐️Star it if you like it⭐️
</div>
---
<p align="left">
<a href="https://erdogant.github.io/llmlight/">
<img src="https://raw.githubusercontent.com/erdogant/llmlight/master/docs/figs/schematic_overview.png" width="600" alt="Schematic Overview" />
</a>
</p>
### Key Features
| Feature | Description |
|---------|-------------|
| [**Local LLM Support**](https://erdogant.github.io/LLMlight/pages/html/Algorithm.html#get-available-llm-models) | Run LLMs locally with minimal dependencies. |
| **Full Prompt Control** | Fine-grained control over prompts, including query, instructions, system message, context, response format, automatic formatting, temperature, and top-p (see the sketch below this table). |
| [**Single Endpoint for All Local Models**](https://erdogant.github.io/LLMlight/pages/html/Algorithm.html#get-available-llm-models) | One unified endpoint to connect different local models, supporting Hermes-3-Llama-3.2-3B, Mistral-7B-Grok, OpenHermes-2.5-Mistral-7B, Gemma-2-9B-IT, and more. |
| [**Flexible Embedding Methods**](https://erdogant.github.io/LLMlight/pages/html/Algorithm.html#preprocessing-chunking) | Multiple embedding strategies: TF-IDF for structured documents, Bag of Words (BOW), BERT for free text, and BGE-Small. |
| [**Advanced Retrieval Methods**](https://erdogant.github.io/LLMlight/pages/html/Algorithm.html#rag-with-statistical-validation) | Supports Naive RAG with fixed chunking and RSE (Relevant Segment Extraction). |
| [**Context Strategies**](https://erdogant.github.io/LLMlight/pages/html/Algorithm.html#rag-with-statistical-validation) | Advanced reasoning for complex queries using Global-reasoning and Chunk-wise approaches. |
| [**Local Memory**](https://erdogant.github.io/LLMlight/pages/html/Saving%20and%20Loading.html) | Store a local knowledge base as a video file (via memvid) for efficient local retrieval. |
| [**PDF Processing**](https://erdogant.github.io/LLMlight/pages/html/Saving%20and%20Loading.html) | Native support for reading and processing PDF documents. |
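
The prompt-control row above maps onto plain keyword arguments. Below is a minimal sketch of what that can look like; the examples further down in this README only demonstrate `context` and `instructions`, so treat `system`, `temperature`, and `top_p` as assumed names and check the API documentation for the authoritative signatures.

```python
from LLMlight import LLMlight

# A minimal sketch of fine-grained prompt control. The names
# temperature, top_p, and system are assumptions inferred from the
# feature table above; context and instructions appear in the
# documented examples below.
client = LLMlight(endpoint='http://localhost:1234/v1/chat/completions',
                  temperature=0.7,  # assumed constructor argument
                  top_p=0.9)        # assumed constructor argument

response = client.prompt('Summarize the context in one sentence.',
                         system='You are a concise assistant.',  # assumed keyword
                         context='LLMlight runs LLMs locally with minimal dependencies.',
                         instructions='Answer in plain English.')
print(response)
```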
---
## Documentation & Resources
- [Documentation](https://erdogant.github.io/LLMlight)
- [Blog Posts](https://erdogant.github.io/LLMlight/pages/html/Documentation.html#medium-blog)
- [GitHub Issues](https://github.com/erdogant/LLMlight/issues)
## Quick Start
### Installation
```bash
# Install from PyPI
pip install LLMlight

# Install from GitHub
pip install git+https://github.com/erdogant/LLMlight
```
### Basic Usage with Endpoint
```python
from LLMlight import LLMlight

# Initialize with default settings
client = LLMlight(endpoint='http://localhost:1234/v1/chat/completions')

# Run a simple query
response = client.prompt('What is the capital of France?',
                         context='The capital of France is Amsterdam.',
                         instructions='Do not argue with the information in the context. Only return the information from the context.')
print(response)
# According to the provided context, the capital of France is Amsterdam.
```
## Examples
### 1. Check Available Models at Endpoint
```python
from LLMlight import LLMlight

# Initialize with LM Studio endpoint
client = LLMlight(model='mistralai/mistral-small-3.2',
                  endpoint='http://localhost:1234/v1/chat/completions')

# List the models that the endpoint exposes
modelnames = client.get_available_models(validate=False)
print(modelnames)
```
### 2. Basic Usage with Local GGUF
```python
from LLMlight import LLMlight

# Use with a local GGUF model
client = LLMlight(endpoint='path/to/your/model.gguf')

# Run a simple query
response = client.prompt('What is the capital of France?',
                         context='The capital of France is Amsterdam.',
                         instructions='Do not argue with the information in the context. Only return the information from the context.')
print(response)
# According to the provided context, the capital of France is Amsterdam.
```
### 3. Using with LM Studio
```python
# Import library
from LLMlight import LLMlight
# Initialize with LM Studio endpoint
client = LLMlight(model='mistralai/mistral-small-3.2',
                  endpoint='http://localhost:1234/v1/chat/completions')
# Run queries
response = client.prompt('Explain quantum computing in simple terms')
```
### 4. Query against PDF files
```python
# Load library
from LLMlight import LLMlight

# Initialize with chunk-wise context strategy and naive RAG retrieval
client = LLMlight(model='mistralai/mistral-small-3.2',
                  context_strategy='chunk-wise',
                  retrieval_method='naive_rag',
                  embedding={'memory': 'memvid', 'context': 'bert'},
                  top_chunks=5)

# Read the PDF
path = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'
pdf_text = client.read_pdf(path, return_type='text')
context = pdf_text + '\n More text can be appended in this manner'

# Make a prompt
response = client.prompt('What is an attention network?',
                         context=context,
                         instructions='Answer the question using only the information from the context. If the answer cannot be found, tell that.')
print(response)
```
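
The retrieval row in the feature table also lists RSE (Relevant Segment Extraction) as an alternative to naive RAG. A hedged variant of the example above, assuming the method string is `'RSE'` (the exact value is not shown in this README):

```python
from LLMlight import LLMlight

# Sketch: swap naive RAG for Relevant Segment Extraction (RSE).
# The string 'RSE' is an assumption based on the feature table;
# read_pdf and prompt are used exactly as in the example above.
client = LLMlight(model='mistralai/mistral-small-3.2',
                  context_strategy='chunk-wise',
                  retrieval_method='RSE',
                  top_chunks=5)

pdf_text = client.read_pdf('https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf',
                           return_type='text')
response = client.prompt('What is an attention network?', context=pdf_text)
print(response)
```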
### 5. Global Reasoning
```python
from LLMlight import LLMlight

# Initialize with the global-reasoning context strategy
client = LLMlight(model='microsoft/phi-4', context_strategy='global-reasoning')

# Read the PDF
path = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'
pdf_text = client.read_pdf(path, return_type='text')

# Make a prompt
response = client.prompt('What is an attention network?',
                         context=pdf_text,
                         instructions='Answer the question using only the information from the context. If the answer cannot be found, tell that.')
print(response)
```
### 6. Creating Local Memory Database
```python
# Load library
from LLMlight import LLMlight

# Initialize with a local memvid database (stored as an .mp4 file)
client = LLMlight(model='mistralai/mistral-small-3.2', file_path='local_database.mp4')

url1 = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'
url2 = 'https://erdogant.github.io/publications/papers/2020%20-%20Taskesen%20et%20al%20-%20HNet%20Hypergeometric%20Networks.pdf'

# Add multiple PDF files to the database
client.memory_add(files=[url1, url2])

# Add more chunks of information
client.memory_add(text=['Small chunk that is also added to the database.',
                        'The capital of France is Amsterdam.'],
                  overwrite=True)

# Add all supported file types from a directory
client.memory_add(dirpath='c:/my_documents/',
                  filetypes=['.pdf', '.txt', '.epub', '.md', '.doc', '.docx', '.rtf', '.html', '.htm'])

# Store to disk
client.memory_save()

# =============================================================================
# Load from database
# =============================================================================

# Import
from LLMlight import LLMlight

# Initialize with the local database
client = LLMlight(model='mistralai/mistral-small-3.2', file_path='local_database.mp4')

# Show the top 5 chunks
client.memory_chunks(n=5)

# Search through the chunks using a query
out1 = client.memory.retriever.search('Attention Is All You Need', top_k=3)
out2 = client.memory.retriever.search('Enrichment analysis, Hypergeometric Networks', top_k=3)
out3 = client.memory.retriever.search('Capital of Amsterdam', top_k=3)
```
### 7. Load Local Memory Database
```python
# Import library
from LLMlight import LLMlight

# Initialize with a previously created memory database
client = LLMlight(preprocessing=None, retrieval_method=None, path_to_memory='knowledge_base.mp4')

# Create queries
response = client.prompt('What do apes like?',
                         instructions='Only return the information from the context. Answer with a maximum of 3 words, and start with "Apes like: "')
print(response)
```
### Contributors
Thanks to all contributors!
<p align="left">
<a href="https://github.com/erdogant/llmlight/graphs/contributors">
<img src="https://contrib.rocks/image?repo=erdogant/llmlight" />
</a>
</p>
### Maintainer
* Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)
* Contributions are welcome.
* Yes! This library is entirely **free** but it runs on coffee! :) Feel free to support with a <a href="https://erdogant.github.io/donate/?currency=USD&amount=5">Coffee</a>.
[Buy me a coffee](https://www.buymeacoffee.com/erdogant)