gpu-estimator


Name: gpu-estimator
Version: 0.1.4
Home page: None
Summary: A Python package for estimating GPU memory requirements and the number of GPUs needed for training machine learning models
Upload time: 2025-09-10 19:38:33
Maintainer: None
Docs URL: None
Author: Hemanth HM
Requires Python: >=3.8
License: MIT
Keywords: gpu, machine-learning, deep-learning, memory, estimation, huggingface, pytorch
# GPU Estimator

A Python package for estimating GPU memory requirements and the number of GPUs needed for training machine learning models.

## Features

- **Latest Model Support**: Built-in configs for LLaMA 4, Gemma 3, Qwen 2.5/3, and more
- Estimate GPU memory requirements based on model parameters
- Calculate optimal number of GPUs for training
- Support for different precision types (FP32, FP16, BF16, INT8)
- Account for optimizer states and gradient storage (see the back-of-envelope sketch below)
- Integration with Hugging Face Hub for latest models
- Discover and search trending models
- Support for popular architectures (GPT, LLaMA, BERT, T5, Mistral, Gemma, Qwen, etc.)
- CLI interface for quick estimates
- Detailed memory breakdown and recommendations
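
As a rough back-of-envelope (not necessarily the exact accounting the estimator uses), training memory per parameter for mixed-precision Adam is dominated by weights, gradients, and optimizer states, before any activation memory:

```python
# Rule-of-thumb bytes per parameter for mixed-precision Adam training
# (a simplification, not the package's exact formula):
#   2 bytes  fp16 weights
#   2 bytes  fp16 gradients
#  12 bytes  fp32 optimizer state (master weights + momentum + variance)
BYTES_PER_PARAM_ADAM_FP16 = 2 + 2 + 12

def rough_training_memory_gb(num_params: float,
                             bytes_per_param: int = BYTES_PER_PARAM_ADAM_FP16) -> float:
    """Lower bound on training memory in GB, excluding activations."""
    return num_params * bytes_per_param / 1024**3

print(f"7B model: ~{rough_training_memory_gb(7e9):.0f} GB before activations")
```

Activations come on top of this and grow with batch size and sequence length, which is why the estimator asks for both.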

## Installation

```bash
pip install gpu-estimator
```


## Quick Start

### Basic Usage
```python
from gpu_estimator import GPUEstimator

estimator = GPUEstimator()

# Estimate for latest models using predefined configs
from gpu_estimator.utils import get_model_config

result = estimator.estimate_from_architecture(
    **get_model_config("qwen2.5-7b"),
    batch_size=8,
    sequence_length=2048,
    precision="fp16"
)

print(f"Memory needed per GPU: {result.memory_per_gpu_gb:.2f} GB")
print(f"Recommended GPUs: {result.num_gpus}")

# Or estimate by parameters for any model size
result = estimator.estimate(
    model_params=7e9,
    batch_size=32,
    sequence_length=2048,
    precision="fp16"
)
```
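
Because `estimate()` only needs a parameter count, a quick way to compare precision settings is to loop over them. This sketch reuses only the call and result fields shown above; the precision strings follow the FP32/FP16/BF16/INT8 list in the features section, and acceptance of all four lowercase names is assumed:

```python
# Compare per-GPU memory and GPU count across precisions for the same
# 7B configuration, reusing only estimate() as shown above.
for precision in ("fp32", "fp16", "bf16", "int8"):
    result = estimator.estimate(
        model_params=7e9,
        batch_size=8,
        sequence_length=2048,
        precision=precision,
    )
    print(f"{precision}: {result.memory_per_gpu_gb:.2f} GB/GPU, {result.num_gpus} GPU(s)")
```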

### Hugging Face Integration

```python
from gpu_estimator import GPUEstimator

estimator = GPUEstimator()

# Estimate directly from Hugging Face model ID
result = estimator.estimate_from_huggingface(
    model_id="meta-llama/Llama-3.2-3B",
    batch_size=4,
    sequence_length=2048,
    precision="fp16",
    gradient_checkpointing=True
)

print(f"Total memory required: {result.total_memory_gb:.2f} GB")
print(f"GPUs needed: {result.num_gpus}")

# Discover trending models
trending = estimator.list_trending_models(limit=10, task="text-generation")
for model in trending:
    print(f"{model.model_id} - {model.downloads:,} downloads")

# Search for specific models
models = estimator.search_models("qwen", limit=5)
for model in models:
    print(f"{model.model_id} - {model.architecture}")
```
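
The discovery and estimation calls compose naturally. A small sketch that feeds the top search hit straight into `estimate_from_huggingface`, using only the calls and result fields shown above:

```python
# Feed the top search result directly into a Hugging Face estimate,
# combining search_models() and estimate_from_huggingface() from above.
matches = estimator.search_models("qwen", limit=3)
if matches:
    top = matches[0]
    result = estimator.estimate_from_huggingface(
        model_id=top.model_id,
        batch_size=4,
        sequence_length=2048,
        precision="bf16",
    )
    print(f"{top.model_id}: {result.total_memory_gb:.2f} GB total, {result.num_gpus} GPU(s)")
```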

## CLI Usage

### Basic Estimation
```bash
# Estimate for any model by parameters
gpu-estimate estimate --model-params 7e9 --batch-size 4 --precision fp16

# Estimate for predefined models (classic)
gpu-estimate estimate --model-name llama-7b --batch-size 8

# Estimate for latest predefined models
gpu-estimate estimate --model-name qwen2.5-7b --batch-size 4 --precision fp16
gpu-estimate estimate --model-name llama3.2-3b --batch-size 16 --gpu-type A100
gpu-estimate estimate --model-name gemma2-9b --batch-size 8 --precision bf16

# Estimate for Hugging Face models
gpu-estimate estimate --huggingface-model meta-llama/Llama-3.2-3B --batch-size 4
gpu-estimate estimate --huggingface-model Qwen/Qwen2.5-7B --batch-size 8
```

### Model Discovery
```bash
# List trending models
gpu-estimate trending --limit 20 --task text-generation

# Search for models
gpu-estimate search "mistral" --limit 10

# Get popular models by architecture
gpu-estimate popular llama --limit 5

# Get model information
gpu-estimate info qwen2.5-7b
```

### Advanced Options
```bash
# With gradient checkpointing and specific GPU
gpu-estimate estimate \
  --huggingface-model meta-llama/Llama-4-Scout-17B \
  --batch-size 8 \
  --seq-length 1024 \
  --precision fp16 \
  --gpu-type A100 \
  --gradient-checkpointing \
  --verbose
```
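
Gradient checkpointing trades extra compute for lower activation memory. A sketch of comparing its effect with the Python API shown earlier (exact savings depend on the model, batch size, and sequence length):

```python
from gpu_estimator import GPUEstimator

estimator = GPUEstimator()

# Same model and settings, with and without gradient checkpointing,
# using estimate_from_huggingface() as in the Python examples above.
for ckpt in (False, True):
    result = estimator.estimate_from_huggingface(
        model_id="meta-llama/Llama-4-Scout-17B",
        batch_size=8,
        sequence_length=1024,
        precision="fp16",
        gradient_checkpointing=ckpt,
    )
    print(f"gradient_checkpointing={ckpt}: {result.total_memory_gb:.2f} GB total")
```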

### Interactive Mode
Launch an interactive session for guided GPU estimation:

```bash
gpu-estimate interactive
```

Features:
- Guided workflows for all estimation tasks
- Model discovery with direct estimation
- Flexible model specification (parameters, names, or HF IDs)
- Step-by-step configuration of training parameters
- Quick estimates from trending model lists

## Supported Models & Architectures

### Hugging Face Models
The package automatically supports any model on Hugging Face Hub by detecting its configuration. Popular architectures include:

| Architecture | Examples | Use Cases |
|-------------|----------|-----------|
| LLaMA/LLaMA2/3/4 | `meta-llama/Llama-2-7b-hf`, `meta-llama/Llama-3.2-3B`, `meta-llama/Llama-4-Scout-17B` | General language modeling, chat |
| GPT | `gpt2`, `microsoft/DialoGPT-large` | Text generation, conversation |
| Mistral | `mistralai/Mistral-7B-v0.1` | Efficient language modeling |
| CodeLlama | `codellama/CodeLlama-7b-Python-hf` | Code generation |
| BERT | `google-bert/bert-base-uncased` | Text classification, NLU |
| T5 | `google-t5/t5-base`, `google/flan-t5-large` | Text-to-text tasks |
| Phi | `microsoft/phi-2` | Small efficient models |
| Gemma/Gemma2/3 | `google/gemma-7b`, `google/gemma-2-9b`, `google/gemma-3-270m` | Google's language models |
| Qwen/Qwen2.5/3 | `Qwen/Qwen-7B`, `Qwen/Qwen2.5-7B`, `Qwen/Qwen3-4B` | Multilingual models |

### Predefined Models
Classic and latest models with known configurations:

**GPT Family:**
- `gpt2`, `gpt2-medium`, `gpt2-large`, `gpt2-xl`, `gpt3`

**LLaMA Family:**
- Original: `llama-7b`, `llama-13b`, `llama-30b`, `llama-65b`  
- LLaMA 2: `llama2-7b`, `llama2-13b`, `llama2-70b`
- LLaMA 3.2: `llama3.2-1b`, `llama3.2-3b`
- LLaMA 3.3: `llama3.3-70b`
- LLaMA 4: `llama4-scout-17b`, `llama4-maverick-17b`
- Code LLaMA: `codellama-7b`, `codellama-13b`, `codellama-34b`

**Mistral Family:**
- `mistral-7b`

**Phi Family:**
- `phi-1.5b`, `phi-2.7b`

**Gemma Family:**
- Original: `gemma-2b`, `gemma-7b`
- Gemma 2: `gemma2-2b`, `gemma2-9b`, `gemma2-27b`
- Gemma 3: `gemma3-270m`

**Qwen Family:**
- Qwen 2.5: `qwen2.5-7b`, `qwen2.5-14b`, `qwen2.5-32b`, `qwen2.5-72b`
- Qwen 3: `qwen3-4b`, `qwen3-30b`, `qwen3-235b`

**Flexible Naming**: Model names support flexible matching. Use `custom-llama-7b`, `my-mistral-7b`, or any name containing a known model identifier.

## GPU Types Supported

| GPU | Memory | Use Case |
|-----|--------|----------|
| H100 | 80 GB | Latest high-performance training |
| A100 | 80 GB | Large model training and inference |
| A40 | 48 GB | Professional workstation training |
| A6000 | 48 GB | Creative and AI workstation |
| L40 | 48 GB | Data center inference |
| L4 | 24 GB | Efficient inference |
| RTX 4090 | 24 GB | Consumer high-end |
| RTX 3090 | 24 GB | Consumer enthusiast |
| V100 | 32 GB | Previous generation training |
| T4 | 16 GB | Cloud inference |

            
