# llm-benchmark (ollama-benchmark)
LLM Benchmark for Throughput via Ollama (Local LLMs)
## Installation prerequisites
A working [Ollama](https://ollama.com) installation.
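Before installing the benchmark, you can verify that Ollama is installed and reachable; both commands below are part of the standard Ollama CLI:

```bash
# Confirm the Ollama CLI is on PATH and print its version
ollama --version
# List locally available models (empty output is fine on a fresh install)
ollama list
```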
## Installation Steps
Depending on your Python setup, use either
```bash
pip install llm-benchmark
```
or
```bash
pipx install llm-benchmark
```
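If `llm_benchmark` is not found afterwards, the package most likely went into a different environment. Printing the CLI help is a quick way to confirm it is on your PATH (this assumes the standard `--help` flag exposed by the CLI framework):

```bash
# Should list the available subcommands, including `run`
llm_benchmark --help
```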
## Usage for general users
```bash
llm_benchmark run
```
## Installation and Usage in Video format

It's tested on Python 3.9 and above.
## Ollama installation with the following models installed

A 7B model can be run on machines with 8GB of RAM.

A 13B model can be run on machines with 16GB of RAM.
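Because model selection is driven by available RAM, it can be useful to check how much memory your machine reports before running the benchmark. These are ordinary OS commands; the tool performs its own detection:

```bash
# Linux: show total memory in human-readable form
free -h
# macOS: show total memory in bytes
sysctl hw.memsize
```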
## Usage explanation

On Windows, Linux, and macOS, the tool detects the amount of system RAM and downloads the required LLM models first.

When the RAM size is at least 4GB but less than 7GB, it checks whether the following models exist and implicitly pulls any that are missing:
```bash
ollama pull deepseek-r1:1.5b
ollama pull gemma:2b
ollama pull phi:2.7b
ollama pull phi3:3.8b
```
When the RAM size is greater than 7GB but less than 15GB, it checks whether the following models exist and implicitly pulls any that are missing:
```bash
ollama pull phi3:3.8b
ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull llama3.1:8b
ollama pull deepseek-r1:8b
ollama pull llava:7b
```
When the RAM size is greater than 15GB but less than 31GB, it checks whether the following models exist and implicitly pulls any that are missing:
```bash
ollama pull gemma2:9b
ollama pull mistral:7b
ollama pull phi4:14b
ollama pull deepseek-r1:8b
ollama pull deepseek-r1:14b
ollama pull llava:7b
ollama pull llava:13b
```
When the RAM size is greater than 31GB, it checks whether the following models exist and implicitly pulls any that are missing:
```bash
ollama pull phi4:14b
ollama pull deepseek-r1:14b
ollama pull deepseek-r1:32b
```
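If you prefer to pre-download a tier's models yourself (for example, on a metered connection), a small shell loop can perform the same pulls ahead of time. This is only a sketch for the 7GB-15GB tier on Linux; adjust the model list and the memory check for your machine:

```bash
# Detect total RAM in GB (Linux-specific; on macOS use `sysctl hw.memsize`)
mem_gb=$(free -g | awk '/^Mem:/{print $2}')
echo "Detected ${mem_gb} GB of RAM"

# Pre-pull the mid-tier models so the benchmark run does not have to download them
for model in phi3:3.8b gemma2:9b mistral:7b llama3.1:8b deepseek-r1:8b llava:7b; do
  ollama pull "$model"
done
```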
## Python Poetry manual (advanced) installation
<https://python-poetry.org/docs/#installing-manually>
## For developers who want to develop new features, on Windows PowerShell, Ubuntu Linux, or macOS
```bash
# Create a virtual environment and activate it
# (on Windows PowerShell, activate with .\.venv\Scripts\Activate.ps1 instead)
python3 -m venv .venv
. ./.venv/bin/activate
# Upgrade the packaging tools, then install Poetry into the environment
pip install -U pip setuptools
pip install poetry
```
## Usage in Python virtual environment
```bash
# Enter the project's virtual environment and install its dependencies
poetry shell
poetry install
# Smoke-test the CLI
llm_benchmark hello jason
```
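Alternatively, you can invoke the CLI through Poetry without spawning a nested shell, using the standard `poetry run` command:

```bash
# Run the benchmark inside the Poetry-managed environment
poetry run llm_benchmark run
```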
### Example #1: Send systeminfo and benchmark results to a remote server
```bash
llm_benchmark run
```
### Example #2: Do not send systeminfo and benchmark results to a remote server
```bash
llm_benchmark run --no-sendinfo
```
### Example #3: Run the benchmark with an explicit path to the ollama executable (when you have built your own developer version of ollama)
```bash
llm_benchmark run --ollamabin=~/code/ollama/ollama
```
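The options from the examples above can presumably be combined in a single invocation, for instance to benchmark a locally built ollama without sending results (assuming the CLI accepts both flags together):

```bash
llm_benchmark run --no-sendinfo --ollamabin=~/code/ollama/ollama
```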
## Reference
[Ollama](https://ollama.com)