# LLMocal: Open Source Local AI Client
> ## ⚠️ **DEVELOPMENT NOTICE** ⚠️
> **This software is currently under active development and is NOT intended for production use.**
> Features may change, break, or be removed without notice. Use at your own risk in development/testing environments only.
**LLMocal** is a professional-grade, open-source client for running large language models locally. Built on the principle that open source models deserve open source clients, LLMocal provides a complete local AI solution without vendor lock-in or proprietary restrictions.
🎯 **Why Open Source?** The AI ecosystem thrives when both models and clients are open. LLMocal ensures you have full control over your AI infrastructure: no subscriptions, no data collection, no proprietary dependencies.
Optimized for Apple Silicon (M1/M2/M3/M4) with cross-platform support for Linux and Windows.
---
[CI](https://github.com/alexnicita/llmocal/actions/workflows/ci.yml)
[License: MIT](https://opensource.org/licenses/MIT)
## 🌟 Features
- **100% Private & Offline**: Your conversations never leave your machine. No APIs, no data collection.
- **High-Performance**: Optimized for Apple Silicon, providing fast, streaming responses.
- **State-of-the-Art Models**: Comes pre-configured with `Mistral-7B-Instruct`, a top-tier open-source model.
- **Easy to Use**: A simple, clean, and intuitive command-line interface.
- **Customizable**: Easily swap out models, adjust performance settings, and extend functionality.
- **Reproducible Setup**: Uses `uv` for fast and reliable dependency management.
- **Cross-Platform**: Works on macOS, Linux, and Windows (WSL recommended).
## 🚀 Quick Start
LLMocal can be used in two ways: as a **pip-installable package** for easy integration into your projects, or by **cloning the repository** for development.
### Option 1: Install as a Python Package (Recommended)
**Prerequisites:** Python 3.11+
```bash
# Install with pip
pip install llmocal
# Or install with uv (faster)
uv add llmocal
```
#### Programmatic Usage
```python
import llmocal
# First time setup - explicit model download
client = llmocal.LLMocal()
client.download_model() # Downloads ~4.4GB model
client.setup() # Load the model
# Chat with the AI
response = client.chat("Explain quantum computing in simple terms")
print(response)
# Or start an interactive session
client.start_interactive_chat()
# Alternative: auto-download if needed
client = llmocal.LLMocal()
client.setup(auto_download=True) # Downloads if model doesn't exist
# Use a custom model
custom_client = llmocal.LLMocal(
    repo_id="TheBloke/CodeLlama-7B-Instruct-GGUF",
    filename="codellama-7b-instruct.Q4_K_M.gguf"
)
custom_client.download_model() # Explicit download
custom_client.setup() # Load the model
code_response = custom_client.chat("Write a Python function to sort a list")
print(code_response)
```
#### Advanced Programmatic Usage
```python
import llmocal
from llmocal import LLMocalConfig
# Advanced configuration
config = LLMocalConfig(
    n_ctx=8192,       # Larger context window
    n_threads=8,      # More CPU threads
    n_gpu_layers=35   # Use GPU acceleration (if available)
)
client = llmocal.LLMocal(config=config)
client.setup()
# Access lower-level components
engine = client.engine
model_manager = client.model_manager
# Direct model management
model_path = model_manager.get_model_path(
    "microsoft/DialoGPT-medium",
    "model.gguf"
)
```
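Because loading the model dominates startup time, it usually pays to create one client and reuse it across prompts. Below is a minimal sketch of that pattern, using only the `LLMocal`, `setup`, and `chat` calls shown above; the `batch_chat` helper is hypothetical, not part of the package.

```python
import llmocal

def batch_chat(prompts: list[str]) -> list[str]:
    """Run several prompts against a single locally loaded model."""
    # Hypothetical helper: one client, reused for every prompt.
    client = llmocal.LLMocal()
    client.setup(auto_download=True)  # downloads the model on first run
    return [client.chat(prompt) for prompt in prompts]

answers = batch_chat([
    "Summarize the GGUF file format in one sentence.",
    "Name three advantages of running an LLM locally.",
])
for answer in answers:
    print(answer)
```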
#### Command Line Usage
After installation, you can use the `llmocal` command:
```bash
# Start interactive chat
llmocal chat
# Use a different model
llmocal chat --repo-id "TheBloke/Llama-2-7B-Chat-GGUF" --filename "llama-2-7b-chat.Q4_K_M.gguf"
```
### Option 2: Development Setup
For development or if you want to modify the code:
**Prerequisites:**
- **Python 3.11+**
- **`uv`**: A fast Python package installer. [Installation guide](https://github.com/astral-sh/uv#installation).
```bash
# Clone the repository
git clone https://github.com/alexnicita/llmocal.git
cd llmocal
# Run the startup script
./scripts/start.sh
```
The first time you run `start.sh`, it will download the model (approx. 4.4 GB), so it may take some time.
## 💬 How to Use
Once the application is running, you'll be greeted by the AI assistant. Just type your questions and press Enter.
### Sample Conversation
````
🤖 Welcome to LLMocal!

You are now chatting with an AI model running entirely on your machine.
- Model: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
- Privacy: 100% offline and private. No data leaves your computer.

Type /exit or /quit to end the chat. Use /help for more commands.

You: Can you explain the concept of zero-knowledge proofs in simple terms?

AI: Of course! Imagine you have a secret, like the password to a treasure chest, but you want to prove to a friend that you know the password without actually revealing it. A zero-knowledge proof is a cryptographic method that lets you do just that. You can convince your friend you have the secret key without them ever learning what it is. It's a fundamental concept in modern cryptography, enabling privacy and security in digital transactions.

You: Write a Python function to find the factorial of a number.

AI: Certainly! Here is a simple and efficient Python function to calculate the factorial of a non-negative integer using recursion:

```python
def factorial(n):
    """
    Calculates the factorial of a non-negative integer.

    Args:
        n: The number to calculate the factorial of.

    Returns:
        The factorial of n.
    """
    if n < 0:
        raise ValueError("Factorial is not defined for negative numbers")
    elif n == 0:
        return 1
    else:
        return n * factorial(n - 1)

# Example usage:
print(f"The factorial of 5 is: {factorial(5)}")  # Output: 120
```
````
### Special Commands
- `/exit` or `/quit`: Exit the chat application.
- `/help`: Display a list of available commands.
- `/model`: Show details about the currently loaded AI model.
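Slash commands like these are typically intercepted before the input reaches the model. The sketch below shows one way such a dispatcher could look; it is purely illustrative and is not LLMocal's actual implementation.

```python
def handle_command(line: str) -> bool:
    """Return True if the line was a slash command and was handled."""
    command = line.strip().lower()
    if command in ("/exit", "/quit"):
        raise SystemExit(0)  # end the chat session
    if command == "/help":
        print("Commands: /exit, /quit, /help, /model")
        return True
    if command == "/model":
        print("Model: TheBloke/Mistral-7B-Instruct-v0.2-GGUF")
        return True
    return False  # not a command; pass the line to the model
```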
## 🔧 Customization: Changing the Model
This project is designed to be model-agnostic. You can easily switch to any GGUF-compatible model from [Hugging Face](https://huggingface.co/models?search=gguf).
**To change the model, use any of the following** (a programmatic variant is sketched after the list):
1. **Use command-line arguments (easiest):**
```bash
uv run python -m llmocal.cli chat --repo-id "TheBloke/Llama-2-7B-Chat-GGUF" --filename "llama-2-7b-chat.Q4_K_M.gguf"
```
2. **Set environment variables:**
```bash
export MODEL_REPO_ID="TheBloke/Llama-2-7B-Chat-GGUF"
export MODEL_FILENAME="llama-2-7b-chat.Q4_K_M.gguf"
./scripts/start.sh
```
3. **Edit the configuration:**
Change the `DEFAULT_REPO_ID` and `DEFAULT_FILENAME` variables in `llmocal/core/config.py`.
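If you set the environment variables from option 2, your own Python entry point can pick them up and fall back to the defaults otherwise. A minimal sketch, assuming only the `repo_id`/`filename` constructor parameters documented above:

```python
import os
import llmocal

# Respect MODEL_REPO_ID / MODEL_FILENAME if set (mirroring start.sh);
# otherwise fall back to the packaged default model.
repo_id = os.environ.get("MODEL_REPO_ID")
filename = os.environ.get("MODEL_FILENAME")

if repo_id and filename:
    client = llmocal.LLMocal(repo_id=repo_id, filename=filename)
else:
    client = llmocal.LLMocal()  # default: Mistral-7B-Instruct

client.setup(auto_download=True)
print(client.chat("Which model are you?"))
```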
## 🔬 Running Tests
A full suite of unit tests is included to ensure everything is working as expected. To run the tests:
```bash
uv run python -m tests.test_core
```
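If you contribute tests of your own, configuration objects are a cheap place to start since they require no model download. A minimal sketch, assuming `LLMocalConfig` stores the `n_ctx`, `n_threads`, and `n_gpu_layers` fields used earlier as attributes of the same name; the file and test names are illustrative:

```python
# test_config_example.py -- hypothetical standalone test
from llmocal import LLMocalConfig

def test_config_overrides():
    # Assumes the constructor keeps these values as same-named attributes.
    config = LLMocalConfig(n_ctx=8192, n_threads=8, n_gpu_layers=35)
    assert config.n_ctx == 8192
    assert config.n_threads == 8
    assert config.n_gpu_layers == 35
```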
## 🛠️ Project Structure
```
llmocal/
├── .github/workflows/ci.yml   # GitHub Actions CI/CD workflow
├── .gitignore                  # Files to ignore for Git
├── LICENSE                     # MIT License
├── README.md                   # This file
├── pyproject.toml              # Project dependencies and metadata
├── scripts/
│   └── start.sh                # Easy startup script
├── llmocal/                    # Main package
│   ├── __init__.py             # Package initialization
│   ├── cli.py                  # Command-line interface
│   ├── core/                   # Core functionality
│   │   ├── __init__.py         # Core module initialization
│   │   ├── config.py           # Configuration management
│   │   ├── engine.py           # AI engine and model loading
│   │   └── chat.py             # Chat interface
│   ├── models/                 # Model management
│   │   ├── __init__.py         # Models module initialization
│   │   └── manager.py          # Model downloading and management
│   ├── api/                    # API server components
│   ├── ui/                     # User interface components
│   └── utils/                  # Utility functions
├── tests/                      # Test suite
│   └── test_core.py            # Core functionality tests
├── docs/                       # Documentation
└── examples/                   # Usage examples
```
## 🤝 Contributing
Contributions are welcome! Whether it's bug fixes, feature additions, or documentation improvements, please feel free to open a pull request. Please make sure all tests pass before submitting.
## 📜 License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.