# localbench
<p align="center">
<img width="1280" alt="localbench Banner" src="./images/robo-bench1.jpg">
</p>
<div align="center">
<table>
<tr>
<td align="center">
<strong>🚧 EARLY DEVELOPMENT WARNING 🚧</strong>
<br />
<br />
<span>
This tool is currently about as stable as a house of cards in a wind tunnel.
<br />
Very early alpha. Bugs aren't just expected - they've signed a lease.
<br />
<br />
<code>Status: Proceed with optimism ☕</code>
</span>
</td>
</tr>
</table>
</div>
A benchmarking tool for local LLMs. It currently keeps an eye on [Cortex.cpp](https://github.com/janhq/cortex.cpp),
with plans to judge other frameworks equally in the future.
## What is this?
`localbench` measures performance metrics, resource utilization, and stability characteristics of your LLM deployments. Rather comprehensive, really.
## Features
- Model initialization metrics
- Runtime performance
- Resource utilization
- Advanced processing scenarios
- Workload-specific benchmarks
- System integration metrics
- Stability analysis
## Installation
Using `uv`:
```bash
uv tool install localbench
```
Using pip:
```bash
pip install localbench
```
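To confirm the install, the CLI should respond to a `--help` call (assuming the standard `localbench` entry point landed on your `PATH`):
```bash
# Smoke test: print usage and exit
localbench --help
```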
## Usage
### Basic Benchmarking
```bash
# Standard benchmark
localbench "llama3.2:3b-gguf-q2-k"
# With detailed metrics
localbench "llama3.2:3b-gguf-q2-k" --verbose
```
### Specific Benchmarks
```bash
# Initialization only
localbench "llama3.2:3b-gguf-q2-k" --type init
# Runtime metrics
localbench "llama3.2:3b-gguf-q2-k" --type runtime
# Long-running stability test
localbench "llama3.2:3b-gguf-q2-k" --type stability --stability-duration 24
```
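The benchmark types can also be chained; a minimal sketch that runs the two quick types from above back to back, assuming their defaults are sensible for your machine:
```bash
# Run the short benchmark types in sequence against the same model
for t in init runtime; do
  localbench "llama3.2:3b-gguf-q2-k" --type "$t"
done
```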
### Advanced Usage
```bash
# Custom benchmark prompts
localbench "llama3.2:3b-gguf-q2-k" --type workload --prompts my_prompts.json
# Multi-model benchmarking
localbench "llama3.2:3b-gguf-q2-k" --type advanced \
--secondary-models "tinyllama:1b-gguf-q4" "phi2:3b-gguf-q4"
# Export results
localbench "llama3.2:3b-gguf-q2-k" --json results.json
```
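The schema of the prompts file isn't documented yet; below is a minimal sketch assuming a plain JSON array of prompt strings (the real format may differ in this alpha), plus a quick way to eyeball exported results:
```bash
# Hypothetical prompts file for --prompts; the actual schema may differ
cat > my_prompts.json <<'EOF'
[
  "Summarize the plot of Hamlet in two sentences.",
  "Explain what a context window is.",
  "Write a haiku about GPUs."
]
EOF

# Pretty-print exported results to inspect the field names
jq '.' results.json
```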
## Status
Under active development. Support for additional frameworks is planned.
## Roadmap
- Framework-agnostic benchmarking
- Additional performance metrics
- Enhanced visualizations
- Extended stability testing
- Local server
## Development
### Setup
1. Clone the repository:
```bash
git clone https://github.com/username/localbench.git
cd localbench
```
2. Create and activate a virtual environment:
```bash
# Using uv (recommended)
uv venv .venv --python 3.12
source .venv/bin/activate
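# On Windows, activate with: .venv\Scripts\activate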
```
3. Install development dependencies:
```bash
# Install project in editable mode with test dependencies
uv pip install -e ".[test]"
# Install development tools
uv add --dev ruff pytest pytest-cov pytest-asyncio hypothesis
```
### Code Quality
#### Linting and Formatting
Run Ruff linter:
```bash
# Check code
ruff check .
# Auto-fix issues
ruff check --fix .
# Format code
ruff format .
# Check formatting without changes
ruff format --check .
```
#### Testing
Run tests:
```bash
# All tests
pytest
# With coverage
pytest --cov=localbench --cov-report=html
# Specific test file
pytest src/tests/test_utils.py
# Verbose output
pytest -v src/tests/test_utils.py
```
### Pre-commit Checks
Before submitting a PR:
```bash
# Format code
ruff format .
# Run linter
ruff check .
# Run tests with coverage
pytest --cov=localbench --cov-report=term-missing
# Serve the HTML coverage report from an earlier --cov-report=html run (optional)
python -m http.server -d htmlcov
```
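These checks can also run automatically via a local git hook; a minimal sketch (not shipped with the repo, and bypassable with `git commit --no-verify`):
```bash
# Write a minimal pre-commit hook that runs format, lint, and tests
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
ruff format --check . && ruff check . && pytest -q
EOF
chmod +x .git/hooks/pre-commit
```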
### Code Style
The project uses:
- Type hints
- Docstrings for public functions and classes
### Project Structure
```
src/
├── localbench/
│   ├── core/
│   │   ├── initialization.py   # Model initialization metrics
│   │   ├── runtime.py          # Runtime performance metrics
│   │   ├── resources.py        # Resource utilization metrics
│   │   ├── integration.py      # System integration metrics
│   │   ├── workloads.py        # Workload-specific metrics
│   │   ├── stability.py        # Stability metrics
│   │   └── utils.py            # Shared utilities
│   ├── cli.py                  # Command-line interface
│   └── __init__.py
└── tests/
    ├── conftest.py             # Shared test fixtures
    ├── test_initialization.py
    ├── test_runtime.py
    ├── test_resources.py
    ├── test_integration.py
    └── test_utils.py
```
### PR Checklist
Before submitting a PR:
1. Run all tests
2. Check test coverage
3. Verify type hints with mypy (coming soon)
4. Ensure docstrings are up to date
## Contributing
Issues and pull requests welcome. Do have a look at the existing ones first, though.