localbench

- Name: localbench
- Version: 0.0.2
- Summary: A benchmarking tool for Local LLMs.
- Authors: Ramon Perez <ramon@menlo.ai>, Minh Nguyen <minh@menlo.ai>
- Uploaded: 2025-02-09 01:39:35
- Requires Python: >=3.12
- Keywords: benchmark, cortex, llm, machine-learning
- Requirements: none recorded
# localbench

<p align="center">
  <img width="1280" alt="localbench Banner" src="./images/robo-bench1.jpg">
</p>

<div align="center">
  <table>
    <tr>
      <td align="center">
        <strong>🚧 EARLY DEVELOPMENT WARNING 🚧</strong>
        <br />
        <br />
        <span>
          This tool is currently about as stable as a house of cards in a wind tunnel.
          <br />
          Very early alpha. Bugs aren't just expected - they've signed a lease.
          <br />
          <br />
          <code>Status: Proceed with optimism ☕</code>
        </span>
      </td>
    </tr>
  </table>
</div>

A benchmarking tool for Local LLMs. Currently keeping an eye on [Cortex.cpp](https://github.com/janhq/cortex.cpp)
but with plans to judge other frameworks equally in the future.

## What is this?

`localbench` measures performance metrics, resource utilization, and stability characteristics of your LLM deployments. Rather comprehensive, really.

## Features

- Model initialization metrics
- Runtime performance
- Resource utilization
- Advanced processing scenarios
- Workload-specific benchmarks
- System integration metrics
- Stability analysis

## Installation

Using `uv`:
```bash
# Install as a persistent tool (or run once without installing: uvx localbench)
uv tool install localbench
```

Using pip:
```bash
pip install localbench
```

## Usage

### Basic Benchmarking
```bash
# Standard benchmark
localbench "llama3.2:3b-gguf-q2-k"

# With detailed metrics
localbench "llama3.2:3b-gguf-q2-k" --verbose
```

### Specific Benchmarks
```bash
# Initialization only
localbench "llama3.2:3b-gguf-q2-k" --type init

# Runtime metrics
localbench "llama3.2:3b-gguf-q2-k" --type runtime

# Long-running stability test
localbench "llama3.2:3b-gguf-q2-k" --type stability --stability-duration 24
```

### Advanced Usage
```bash
# Custom benchmark prompts
localbench "llama3.2:3b-gguf-q2-k" --type workload --prompts my_prompts.json

# Multi-model benchmarking
localbench "llama3.2:3b-gguf-q2-k" --type advanced \
    --secondary-models "tinyllama:1b-gguf-q4" "phi2:3b-gguf-q4"

# Export results
localbench "llama3.2:3b-gguf-q2-k" --json results.json
```
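Results exported with `--json` can be post-processed with a few lines of Python. The field names below (`model`, `runs`, `tokens_per_second`, `ttft_ms`) are illustrative assumptions, not the documented schema — inspect your own `results.json` before relying on them:

```python
import json

# Hypothetical results payload -- the actual schema written by
# `localbench ... --json results.json` may differ; check your file first.
sample = """
{
  "model": "llama3.2:3b-gguf-q2-k",
  "runs": [
    {"tokens_per_second": 41.2, "ttft_ms": 180},
    {"tokens_per_second": 39.8, "ttft_ms": 195},
    {"tokens_per_second": 40.5, "ttft_ms": 176}
  ]
}
"""

def summarize(payload: str) -> dict:
    """Return mean tokens/sec and mean time-to-first-token across runs."""
    data = json.loads(payload)
    runs = data["runs"]
    return {
        "model": data["model"],
        "mean_tps": sum(r["tokens_per_second"] for r in runs) / len(runs),
        "mean_ttft_ms": sum(r["ttft_ms"] for r in runs) / len(runs),
    }

print(summarize(sample))
```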

## Status

Under active development. Support for additional frameworks is planned.

## Roadmap

- Framework-agnostic benchmarking
- Additional performance metrics
- Enhanced visualizations
- Extended stability testing
- Local server mode

## Development

### Setup

1. Clone the repository:
```bash
git clone https://github.com/username/localbench.git
cd localbench
```

2. Create and activate a virtual environment:
```bash
# Using uv (recommended)
uv venv .venv --python 3.12
source .venv/bin/activate
```

3. Install development dependencies:
```bash
# Install project in editable mode with test dependencies
uv pip install -e ".[test]"

# Install development tools
uv add --dev ruff pytest pytest-cov pytest-asyncio hypothesis
```

### Code Quality

#### Linting and Formatting

Run Ruff linter:
```bash
# Check code
ruff check .

# Auto-fix issues
ruff check --fix .

# Format code
ruff format .

# Check formatting without changes
ruff format --check .
```

#### Testing

Run tests:
```bash
# All tests
pytest

# With coverage
pytest --cov=localbench --cov-report=html

# Specific test file
pytest src/tests/test_utils.py

# Verbose output
pytest -v src/tests/test_utils.py
```
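The suite uses plain pytest-style test functions. As a sketch of that style, here is a minimal test file; `tokens_per_second` is a hypothetical helper defined inline for illustration, not necessarily part of localbench's actual API:

```python
# test_throughput.py -- runs under `pytest`; the helper below is a
# hypothetical example, not localbench's real implementation.

def tokens_per_second(token_count: int, elapsed_s: float) -> float:
    """Throughput in tokens/sec; rejects non-positive elapsed times."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return token_count / elapsed_s

def test_tokens_per_second_basic():
    assert tokens_per_second(100, 2.0) == 50.0

def test_tokens_per_second_rejects_zero():
    try:
        tokens_per_second(10, 0.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```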

### Pre-commit Checks

Before submitting a PR:
```bash
# Format code
ruff format .

# Run linter
ruff check .

# Run tests with coverage
pytest --cov=localbench --cov-report=term-missing

# Show coverage report in browser (optional)
python -m http.server -d htmlcov
```

### Code Style

The project uses:
- Type hints
- Docstrings for public functions and classes

### Project Structure
```
src/
├── localbench/
│   ├── core/
│   │   ├── initialization.py   # Model initialization metrics
│   │   ├── runtime.py         # Runtime performance metrics
│   │   ├── resources.py       # Resource utilization metrics
│   │   ├── integration.py     # System integration metrics
│   │   ├── workloads.py      # Workload-specific metrics
│   │   ├── stability.py       # Stability metrics
│   │   └── utils.py          # Shared utilities
│   ├── cli.py                # Command-line interface
│   └── __init__.py
└── tests/
    ├── conftest.py           # Shared test fixtures
    ├── test_initialization.py
    ├── test_runtime.py
    ├── test_resources.py
    ├── test_integration.py
    └── test_utils.py
```

### PR Checklist

Before submitting a PR:
1. Run all tests
2. Check test coverage
3. Verify type hints with mypy (coming soon)
4. Ensure docstrings are up to date


## Contributing

Issues and pull requests welcome. Do have a look at the existing ones first, though.

            
