A comprehensive multi-vendor GPU health monitoring and optimization tool that helps users assess GPU performance and select optimal hardware for their workloads.
## 🚀 Features

- 🔥 **Comprehensive GPU Health Monitoring**: Temperature, power, utilization, and throttling detection
- ⚡ **Advanced Stress Testing**: Compute, memory bandwidth, VRAM, and mixed-precision tests
- 📊 **Detailed Health Scoring**: 100-point scoring system with actionable recommendations
- 🖥️ **Multi-GPU Support**: Test and compare multiple GPUs simultaneously
- 🧪 **Mock Mode**: Test on any computer without GPUs (perfect for development)
- 🔌 **Multi-Vendor Support**: NVIDIA, AMD, Intel, and a mock mode
- ☁️ **Cloud-Ready**: Designed to help select optimal GPUs for cloud deployment (coming soon!)
## Installation
### Basic Installation (Mock Mode Only)

```bash
# For systems without GPUs, or for testing
pip install gpu-benchmark-tool
```

### Installation with GPU Support

```bash
# For NVIDIA GPUs (most common)
pip install gpu-benchmark-tool[nvidia]

# For AMD GPUs
pip install gpu-benchmark-tool[amd]

# For Intel GPUs
pip install gpu-benchmark-tool[intel]

# For all GPU vendors
pip install gpu-benchmark-tool[all]
```
## 🎯 Quick Start

### 1. Check Available GPUs

```bash
gpu-benchmark list
```

### 2. Run a Benchmark

```bash
# Benchmark all GPUs
gpu-benchmark benchmark

# Benchmark a specific GPU (recommended)
gpu-benchmark benchmark --gpu-id 0

# Quick 30-second test
gpu-benchmark benchmark --gpu-id 0 --duration 30

# Export results to JSON
gpu-benchmark benchmark --gpu-id 0 --export results.json
```

### 3. Mock Mode (No GPU Required)

```bash
# Perfect for development or systems without GPUs
gpu-benchmark benchmark --mock --duration 30
```
## 📊 Google Colab Quick Start

```bash
# Run in a Colab notebook (Runtime > Change runtime type > GPU)
!pip install gpu-benchmark-tool[nvidia]
!gpu-benchmark benchmark --gpu-id 0 --duration 30
```
## Understanding Results

### Health Score (0-100 points)

- **85-100**: 🟢 Healthy - safe for all workloads, including AI training
- **70-84**: 🟢 Good - suitable for most workloads
- **55-69**: 🟡 Degraded - limit to inference or light compute
- **40-54**: 🟡 Warning - monitor closely, avoid heavy workloads
- **0-39**: 🔴 Critical - do not use for production
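The bands above are easy to encode when post-processing exported results. A minimal sketch (the function name and exact boundary handling are illustrative, not part of the package's API):

```python
def health_status(score: int) -> str:
    """Map a 0-100 health score to the status bands listed above."""
    if score >= 85:
        return "Healthy"
    if score >= 70:
        return "Good"
    if score >= 55:
        return "Degraded"
    if score >= 40:
        return "Warning"
    return "Critical"
```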
### Score Components
Each component contributes to the total 100-point score:
**Temperature (20 points)**
- Peak temperature during the stress test
- Under 80°C: Full points
- 80-85°C: 15 points
- 85-90°C: 10 points
- Over 90°C: 5 points

**Baseline Temperature (10 points)**
- GPU temperature at idle
- Under 50°C: Full points
- 50-60°C: 5 points
- Over 60°C: 0 points
**Power Efficiency (10 points)**
- Power consumption optimization
- Within optimal range: Full points
- Slightly outside range: 5 points
- Far from optimal: 0 points
**GPU Utilization (10 points)**
- How well the GPU is utilized during tests
- 99%+: Full points
- 90-98%: 5 points
- Under 90%: 0 points
**Throttling (20 points)**
- Thermal or power throttling detection
- No throttling: Full points
- Occasional throttling: 10-15 points
- Frequent throttling: 0-5 points
**Errors (20 points)**
- Stability during stress tests
- No errors: Full points
- Few errors: 10-15 points
- Many errors: 0-5 points
**Temperature Stability (10 points)**
- Temperature consistency during tests
- Very stable: Full points
- Some fluctuation: 5-7 points
- Unstable: 0-5 points
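As an illustration of how a single component maps to points, here is the peak-temperature rubric above expressed as code. This is a hypothetical helper; how the tool treats exact boundary values such as 85°C is an assumption here:

```python
def temperature_points(peak_temp_c: float) -> int:
    """Score the peak stress-test temperature per the 20-point rubric above."""
    if peak_temp_c < 80:   # Under 80°C: full points
        return 20
    if peak_temp_c < 85:   # 80-85°C
        return 15
    if peak_temp_c < 90:   # 85-90°C
        return 10
    return 5               # Over 90°C
```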
## Performance Metrics

- **Matrix Multiplication**: Raw compute performance (TFLOPS)
- **Memory Bandwidth**: Memory throughput (GB/s)
- **VRAM Stress**: Memory allocation stability
- **Mixed Precision**: FP16/BF16 support for AI workloads
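For context on the TFLOPS number: an N×N matrix multiplication performs roughly 2·N³ floating-point operations, so compute throughput falls out of a timed run. A sketch of that arithmetic (the tool's exact FLOP accounting may differ):

```python
def estimate_tflops(n: int, iterations: int, elapsed_s: float) -> float:
    """TFLOPS for `iterations` timed N x N matrix multiplications.

    Each N x N matmul costs ~2 * N**3 FLOPs (N multiplies plus
    N-1 adds per output element, across N**2 output elements).
    """
    total_flops = 2 * n**3 * iterations
    return total_flops / elapsed_s / 1e12
```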
## Command Line Usage

### Benchmark Command

```text
gpu-benchmark benchmark [OPTIONS]

Options:
  --gpu-id INTEGER    Specific GPU to test (default: all GPUs)
  --duration INTEGER  Test duration in seconds (default: 60)
  --basic             Run basic tests only (faster)
  --export TEXT       Export results to a JSON file
  --verbose           Show detailed output
  --mock              Use a mock GPU (no hardware required)
```
### Examples

```bash
# Full test on GPU 0 with export
gpu-benchmark benchmark --gpu-id 0 --duration 120 --export full_test.json

# Quick health check
gpu-benchmark benchmark --gpu-id 0 --duration 30 --basic

# Development testing
gpu-benchmark benchmark --mock --export mock_results.json
```
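Files produced with `--export` are plain JSON, so they can be post-processed with standard tooling. A minimal sketch, assuming the single-GPU layout used in the Python API section below (a `health_score` object with `score` and `status` keys); `health_summary` is a hypothetical helper, not part of the package:

```python
import json

def health_summary(results: dict) -> tuple:
    """Pull (score, status) out of an exported results dict."""
    hs = results["health_score"]
    return hs["score"], hs["status"]

# Usage with a file produced by --export:
# with open("results.json") as f:
#     score, status = health_summary(json.load(f))
```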
## Real-time Monitoring
```bash
# Monitor GPU metrics in real time (NVIDIA only)
gpu-benchmark monitor --gpu-id 0
```
## Python API Usage

### Basic Usage

```python
import pynvml

from gpu_benchmark import run_full_benchmark

# Initialize NVML
pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

# Run the benchmark
results = run_full_benchmark(
    handle=handle,
    duration=60,
    enhanced=True,
    device_id=0,
)

# Access the results
print(f"Health Score: {results['health_score']['score']}/100")
print(f"Status: {results['health_score']['status']}")
```
### Analyzing Results

```python
# Check whether the GPU is healthy enough for production
if results['health_score']['score'] >= 70:
    print("✅ GPU is suitable for production workloads")
else:
    print("⚠️ GPU needs attention")

# Access performance metrics
if 'performance_tests' in results:
    tflops = results['performance_tests']['matrix_multiply']['tflops']
    print(f"Compute Performance: {tflops:.2f} TFLOPS")
```
## 🔧 Troubleshooting

### Common Issues

**"No GPUs found"**

- Use the `--mock` flag to test without GPUs
- Ensure the NVIDIA/AMD/Intel drivers are installed

**"NVML Error" on Colab**

- This warning can be ignored; the tool still works correctly
- Use `--gpu-id 0` for cleaner output

### Low Health Scores

- Check the system cooling
- Ensure the GPU isn't thermal throttling
- Close other GPU applications

### Multi-GPU JSON Format

- Use `--gpu-id 0` to test a single GPU (simpler output)
- Without `--gpu-id`, results are nested under the `results` key
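A sketch of handling both layouts when parsing exported JSON. The per-GPU nesting shown here is an assumption based on the description above; verify the key names against your own export:

```python
def per_gpu_scores(report: dict) -> dict:
    """Return {gpu_id: health score} for single- or multi-GPU exports."""
    if "results" in report:  # multi-GPU run (no --gpu-id)
        return {gpu_id: entry["health_score"]["score"]
                for gpu_id, entry in report["results"].items()}
    # Single-GPU run (--gpu-id N); the id itself isn't assumed to be
    # present in the file, so "0" is used as a placeholder key.
    return {"0": report["health_score"]["score"]}
```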
## Supported GPUs

### NVIDIA GPUs (Full Support)

- Consumer: RTX 4090, 4080, 4070, 3090, 3080, 3070, 3060
- Data Center: A100, V100, T4, P100, K80
- Workstation: RTX A6000, A5000, A4000

### AMD GPUs (ROCm Required)

- MI250X, MI210, MI100
- Radeon RX 7900 XTX, RX 6900 XT

### Intel GPUs (Limited Support)

- Arc A770, A750
- Intel Xe integrated graphics
## Requirements

- Python 3.8 or higher
- For NVIDIA: CUDA drivers
- For AMD: ROCm drivers
- For Intel: Intel GPU drivers

## 📄 License

MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

Built to solve real-world GPU selection challenges and reduce cloud computing costs through better hardware decisions.

## 📧 Contact

- PyPI: https://pypi.org/project/gpu-benchmark-tool/
- Email: ywrajput@gmail.com