whisper-parallel-cpu

**Version:** 1.2.3 (PyPI)
**Summary:** High-performance audio and video transcription using whisper.cpp with automatic model downloading and CPU parallelism
**Upload time:** 2025-07-25 10:44:35
**Requires Python:** >=3.8
**Keywords:** whisper, transcription, audio, speech, ai, ml, cpp, pybind11, cpu, parallel, multithreading
**Requirements:** none recorded
# Whisper Parallel CPU Audio & Video Transcriber

A minimal, robust Python package for whisper.cpp with CPU-optimized threading and integrated model management. It transcribes both audio and video files with high performance, and targets distributed cloud deployments and transcription workflows.

## 🚀 Quick Start

**Install from PyPI:**
```bash
pip install whisper-parallel-cpu
```

**Use in Python:**
```python
import whisper_parallel_cpu

# Transcribe audio files
text = whisper_parallel_cpu.transcribe("audio.mp3", model="base")

# Transcribe video files
text = whisper_parallel_cpu.transcribe("video.mp4", model="base")

# Or use specific functions
text = whisper_parallel_cpu.transcribe_audio("audio.wav", model="small")
text = whisper_parallel_cpu.transcribe_video("video.mkv", model="medium")
```

**Or use the CLI:**
```bash
# Transcribe audio
whisper_parallel_cpu transcribe audio.mp3 --model base

# Transcribe video
whisper_parallel_cpu transcribe video.mp4 --model base
```

---

## ✨ Features

- **Native C++/pybind11 speed** (CPU & GPU acceleration)
- **Automatic model download/caching** - no manual setup required
- **Simple Python & CLI interface** - just `pip install` and go
- **Input**: Audio (`.mp3`, `.wav`, `.flac`, `.aac`, `.ogg`, `.m4a`) and video (`.mp4`, `.mkv`, `.avi`, `.mov`) formats
- **Output**: Transcribed text as a Python string
- **Benchmarking**: Built-in performance testing and optimization tools
- **Cross-platform**: Works on macOS, Linux, and Windows

---

## 📦 Installation

### From PyPI (Recommended)
```bash
pip install whisper-parallel-cpu
```

### From Source (Development)
```bash
# Clone the repository
git clone https://github.com/krisfur/whisper-parallel-cpu.git
cd whisper-parallel-cpu

# Install in editable mode
pip install -e .

# Test the installation
python test_transcribe.py video.mp4
```

---

## 🧰 Requirements

### System Tools
- **C++17 compiler** (`g++`, `clang++`) - automatically handled by pip
- **cmake** (>=3.15) - automatically handled by pip
- **ffmpeg** (for audio extraction)

### Install ffmpeg

**macOS:**
```bash
brew install ffmpeg
```

**Ubuntu/Debian:**
```bash
sudo apt update && sudo apt install ffmpeg
```

**Windows:**
Download from [ffmpeg.org](https://ffmpeg.org/download.html) or use Chocolatey:
```bash
choco install ffmpeg
```
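Since a missing ffmpeg only surfaces at transcription time, it can be worth checking up front. A minimal sketch (the `ffmpeg_available` helper is hypothetical, not part of the package):

```python
import shutil

def ffmpeg_available() -> bool:
    """Return True if an ffmpeg executable is on the PATH."""
    return shutil.which("ffmpeg") is not None

if not ffmpeg_available():
    print("ffmpeg not found - install it before transcribing (see above)")
```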

---

## 🧪 Usage

### Python API

#### Basic Usage (Function-based)

```python
import whisper_parallel_cpu

# Transcribe any audio or video file (auto-detects format)
text = whisper_parallel_cpu.transcribe("audio.mp3", model="base", threads=4)
text = whisper_parallel_cpu.transcribe("video.mp4", model="small")

# Use specific functions for audio or video
text = whisper_parallel_cpu.transcribe_audio("audio.wav", model="base", threads=4)
text = whisper_parallel_cpu.transcribe_video("video.mkv", model="medium", threads=8)

# CPU-only mode (no GPU)
text = whisper_parallel_cpu.transcribe("audio.flac", model="base", use_gpu=False)
```

#### Advanced Usage (Model Reuse)

For better performance when transcribing multiple files, use the `WhisperModel` class to load the model once and reuse it:

```python
from whisper_parallel_cpu import WhisperModel

# Create a model instance (model is loaded on first use)
model = WhisperModel(model="base", use_gpu=False, threads=4)

# Transcribe multiple files using the same loaded model
files = ["audio1.mp3", "audio2.wav", "video1.mp4", "video2.mkv"]
for file_path in files:
    text = model.transcribe(file_path)
    print(f"Transcribed {file_path}: {text[:100]}...")

# Use as context manager
with WhisperModel(model="small", use_gpu=True) as model:
    text1 = model.transcribe("audio1.mp3")
    text2 = model.transcribe("audio2.wav")
    # Model is automatically managed

# Memory management
model.clear_contexts()  # Free memory
print(f"Active contexts: {model.get_context_count()}")
```

### Supported File Formats

**Audio Formats:**
- `.mp3`, `.wav`, `.flac`, `.aac`, `.ogg`, `.m4a`, `.wma`, `.opus`, `.webm`, `.3gp`, `.amr`, `.au`, `.ra`, `.mid`, `.midi`

**Video Formats:**
- `.mp4`, `.avi`, `.mov`, `.mkv`, `.wmv`, `.flv`, `.webm`, `.m4v`, `.3gp`, `.ogv`, `.ts`, `.mts`, `.m2ts`
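The auto-detection used by `transcribe()` can be pictured as extension-based dispatch over the lists above. This is only an illustrative sketch, not necessarily the package's internal logic; note that `.webm` and `.3gp` appear in both lists, so they are left out here and would need content probing to classify reliably:

```python
from pathlib import Path

# Extension sets taken from the lists above (ambiguous .webm/.3gp omitted).
AUDIO_EXTS = {".mp3", ".wav", ".flac", ".aac", ".ogg", ".m4a", ".wma",
              ".opus", ".amr", ".au", ".ra", ".mid", ".midi"}
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv", ".wmv", ".flv", ".m4v",
              ".ogv", ".ts", ".mts", ".m2ts"}

def media_kind(path: str) -> str:
    """Classify a file as 'audio' or 'video' by its extension."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    raise ValueError(f"Unsupported file extension: {ext}")
```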

### Available Models

The following models are available and will be downloaded automatically:

| Model | Size | Accuracy | Speed | Use Case |
|-------|------|----------|-------|----------|
| `tiny` | 74MB | Good | Fastest | Quick transcriptions |
| `base` | 141MB | Better | Fast | General purpose |
| `small` | 444MB | Better | Medium | High accuracy needed |
| `medium` | 1.4GB | Best | Slow | Maximum accuracy |
| `large` | 2.9GB | Best | Slowest | Professional use |
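Before downloading, the table above makes it easy to estimate disk usage. A small helper (hypothetical, using the approximate sizes from the table):

```python
# Approximate download sizes from the table above, in megabytes.
MODEL_SIZES_MB = {"tiny": 74, "base": 141, "small": 444,
                  "medium": 1400, "large": 2900}

def total_download_mb(models) -> int:
    """Estimated combined download size for a collection of model names."""
    return sum(MODEL_SIZES_MB[m] for m in models)
```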

### Command Line Interface

```bash
# List available models
whisper_parallel_cpu list

# Download a specific model
whisper_parallel_cpu download base

# Transcribe audio files
whisper_parallel_cpu transcribe audio.mp3 --model base --threads 4
whisper_parallel_cpu transcribe audio.wav --model small

# Transcribe video files
whisper_parallel_cpu transcribe video.mp4 --model base --threads 4
whisper_parallel_cpu transcribe video.mkv --model medium

# Transcribe without GPU (CPU-only)
whisper_parallel_cpu transcribe audio.flac --model small --no-gpu
```

### Model Management

```python
import whisper_parallel_cpu

# List available models
whisper_parallel_cpu.list_models()

# Download a specific model
whisper_parallel_cpu.download_model("medium")

# Force re-download
whisper_parallel_cpu.download_model("base", force=True)
```

---

## 📊 Benchmarking & Performance

### Run Performance Tests

```bash
# Test with 5 audio/video copies
python benchmark.py audio.mp3 5
python benchmark.py video.mp4 5
```

### What the Benchmark Tests

1. **Thread Scaling**: Tests different thread counts (1, 2, 4, 8, 16, etc.) for single audio/video transcription
2. **Sequential Processing**: Measures throughput when processing multiple audio/video files one after another
3. **Parallel Processing**: Tests concurrent processing with different numbers of workers
4. **Optimal Configuration**: Provides the best settings for your specific hardware
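The thread-scaling step can be sketched as a small timing harness. This is an assumption about how such a benchmark might be wired up, not the contents of `benchmark.py`; `transcribe_fn` stands in for any callable such as `whisper_parallel_cpu.transcribe` with a model pre-selected:

```python
import time

def thread_scaling(transcribe_fn, file_path, thread_counts=(1, 2, 4, 8)):
    """Time one transcription per thread count; return {threads: seconds}."""
    timings = {}
    for n in thread_counts:
        start = time.perf_counter()
        transcribe_fn(file_path, threads=n)
        timings[n] = time.perf_counter() - start
    return timings
```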

### Performance Optimization Tips

1. **Model Reuse**: Use `WhisperModel` class for multiple transcriptions to avoid reloading the model each time
2. **GPU Acceleration**: The system automatically uses Metal (macOS) or CUDA (Linux/Windows) when available
3. **Thread Count**: Use the benchmark to find optimal thread count for your CPU
4. **Batch Processing**: For multiple audio/video files, use parallel processing with ThreadPoolExecutor
5. **Model Size**: Smaller models (base, small) are faster but less accurate than larger ones (medium, large)
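Tip 4 can be sketched with `concurrent.futures.ThreadPoolExecutor`. The `transcribe_fn` parameter is a stand-in for e.g. `WhisperModel("base").transcribe`, and the wiring here is illustrative; use the benchmark to pick `max_workers` for your CPU:

```python
from concurrent.futures import ThreadPoolExecutor

def transcribe_batch(transcribe_fn, files, max_workers=2):
    """Transcribe files concurrently; return {path: transcribed text}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {path: pool.submit(transcribe_fn, path) for path in files}
        return {path: fut.result() for path, fut in futures.items()}
```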

### Model Reuse Performance

When transcribing multiple files, using the `WhisperModel` class can provide significant performance improvements:

```python
import time

import whisper_parallel_cpu
from whisper_parallel_cpu import WhisperModel

files = ["audio1.mp3", "audio2.wav", "video1.mp4"]

# Method 1: Using WhisperModel (model reuse) - FASTER
model = WhisperModel(model="base")
start = time.time()
for file in files:
    text = model.transcribe(file)
model_time = time.time() - start

# Method 2: Using the transcribe function (no reuse) - SLOWER
start = time.time()
for file in files:
    text = whisper_parallel_cpu.transcribe(file, model="base")
function_time = time.time() - start

print(f"Speedup with model reuse: {function_time / model_time:.2f}x")
```

**Typical benefits:**
- 2-5x faster when transcribing multiple files with the same model
- Reduced memory usage through context sharing
- Better suited to batch-processing workflows
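The gain comes from paying the model-load cost once. A simplified illustration of context caching keyed by model name (this is a toy sketch of the idea, not the package's internals; `_load_context` fakes the expensive load):

```python
import time

_CONTEXTS = {}  # model name -> loaded context (illustrative cache)

def _load_context(model: str):
    time.sleep(0.01)  # stand-in for the expensive model load
    return f"ctx:{model}"

def get_context(model: str):
    """Load the model on first use; later calls reuse the cached context."""
    if model not in _CONTEXTS:
        _CONTEXTS[model] = _load_context(model)
    return _CONTEXTS[model]
```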

---

## ⚙️ API Reference

### `transcribe(file_path, model, threads, use_gpu)`

Transcribes an audio or video file using Whisper. Automatically detects file type.

**Parameters:**
- `file_path` (str): Path to the audio or video file
- `model` (str): Model name (e.g. "base", "tiny", etc.) or path to Whisper model binary (.bin file)
- `threads` (int): Number of CPU threads to use (default: 4)
- `use_gpu` (bool): Whether to use GPU acceleration (default: True)

**Returns:**
- `str`: Transcribed text

### `transcribe_audio(audio_path, model, threads, use_gpu)`

Transcribes an audio file using Whisper.

**Parameters:**
- `audio_path` (str): Path to the audio file
- `model` (str): Model name (e.g. "base", "tiny", etc.) or path to Whisper model binary (.bin file)
- `threads` (int): Number of CPU threads to use (default: 4)
- `use_gpu` (bool): Whether to use GPU acceleration (default: True)

**Returns:**
- `str`: Transcribed text

### `transcribe_video(video_path, model, threads, use_gpu)`

Transcribes a video file using Whisper.

**Parameters:**
- `video_path` (str): Path to the video file
- `model` (str): Model name (e.g. "base", "tiny", etc.) or path to Whisper model binary (.bin file)
- `threads` (int): Number of CPU threads to use (default: 4)
- `use_gpu` (bool): Whether to use GPU acceleration (default: True)

**Returns:**
- `str`: Transcribed text

**Example:**
```python
import whisper_parallel_cpu

# Basic usage
text = whisper_parallel_cpu.transcribe_video("sample.mp4")

# Advanced usage
text = whisper_parallel_cpu.transcribe_video(
    "sample.mp4", 
    model="medium", 
    threads=8, 
    use_gpu=False
)
```

### `WhisperModel(model, use_gpu, threads)`

A class for efficient model reuse across multiple transcriptions.

**Parameters:**
- `model` (str): Model name (e.g. "base", "tiny", etc.) or path to Whisper model binary (.bin file)
- `use_gpu` (bool): Whether to use GPU acceleration (default: False)
- `threads` (int): Number of CPU threads to use (default: 4)

**Methods:**
- `transcribe(file_path)`: Transcribe any audio or video file
- `transcribe_audio(audio_path)`: Transcribe an audio file
- `transcribe_video(video_path)`: Transcribe a video file
- `clear_contexts()`: Clear all cached contexts to free memory
- `get_context_count()`: Get number of cached contexts

**Example:**
```python
from whisper_parallel_cpu import WhisperModel

# Create model instance
model = WhisperModel(model="base", use_gpu=False, threads=4)

# Transcribe multiple files efficiently
files = ["audio1.mp3", "audio2.wav", "video1.mp4"]
for file_path in files:
    text = model.transcribe(file_path)
    print(f"Transcribed: {text[:50]}...")

# Memory management
model.clear_contexts()
```

### `clear_contexts()`

Clear all cached whisper contexts to free memory.

**Example:**
```python
import whisper_parallel_cpu

# Clear all cached contexts
whisper_parallel_cpu.clear_contexts()
```

### `get_context_count()`

Get the number of currently cached whisper contexts.

**Returns:**
- `int`: Number of cached contexts

**Example:**
```python
import whisper_parallel_cpu

# Check how many contexts are cached
count = whisper_parallel_cpu.get_context_count()
print(f"Active contexts: {count}")
```

---

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make your changes and test thoroughly
4. Commit your changes: `git commit -m 'Add feature'`
5. Push to the branch: `git push origin feature-name`
6. Submit a pull request

---

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- Built on [whisper.cpp](https://github.com/ggerganov/whisper.cpp) by Georgi Gerganov
- Uses [pybind11](https://github.com/pybind/pybind11) for Python bindings
- Model management inspired by the original OpenAI Whisper project

            
