whisperpipe

Name: whisperpipe
Version: 0.1.0
Summary: Real-time speech-to-text streaming with OpenAI Whisper
Author: Erfan Ramezani
License: MIT
Requires Python: >=3.9, <3.13
Upload time: 2025-10-20 23:24:12
Keywords: whisper, openai, speech-to-text, stt, asr, automatic speech recognition, real-time, realtime, audio, microphone, transcription, streaming, live transcription
# whisperpipe

Real-time speech-to-text streaming with OpenAI Whisper

## Description

whisperpipe is a powerful, easy-to-use Python package for real-time, offline audio transcription using OpenAI's Whisper model. It runs locally, making it a free and private solution for continuous speech-to-text applications. It provides seamless integration with callback functions for LLM processing and supports pause/resume functionality for interactive applications.

## Why whisperpipe?

In a world where most ASR (Automatic Speech Recognition) services are cloud-based, whisperpipe offers a refreshing alternative by harnessing the power of OpenAI's Whisper model to run directly on your local machine. This approach provides several key advantages:

- **Complete Privacy**: Since all transcription is done locally, your voice data never leaves your computer. This is crucial for applications that handle sensitive or private conversations.
- **Zero Cost**: Say goodbye to recurring subscription fees and per-minute charges. whisperpipe is free to use, making it an economical choice for both hobbyists and commercial projects.
- **No Internet Required**: Whether you're on a plane, in a remote location, or simply have an unstable internet connection, whisperpipe works flawlessly offline.
- **Real-time Performance**: Designed for continuous, real-time transcription, whisperpipe is ideal for live applications such as voice-controlled assistants, dictation software, and more.
- **Unleash the Power of Whisper**: By running the Whisper model locally, you have full control over the transcription process, from model selection to performance tuning.

whisperpipe empowers you to build powerful, private, and cost-effective voice applications with ease.

## Features

- **Real-time audio transcription** using OpenAI Whisper
- **Callback system** for custom processing (LLM integration, etc.)
- **Pause/Resume functionality** for interactive applications
- **Multiple language support**
- **Configurable processing parameters**
- **Thread-safe operation**
- **Easy installation and usage**

## Installation

### From PyPI

```bash
pip install whisperpipe
```

### From GitHub

```bash
pip install git+https://github.com/Erfan-ram/whisperpipe.git
```

## Quick Start

```python
from whisperpipe import pipeStream

# Basic usage
transcriber = pipeStream(
    model_name="base",
    language="en",
    finalization_delay=10.0,
    processing_interval=1.0
)

# Start streaming
transcriber.start_streaming()
```

## Usage Examples

### Basic Transcription

```python
from whisperpipe import pipeStream

# Create transcriber instance
transcriber = pipeStream(
    model_name="base",
    language="en",
    finalization_delay=10.0,
    processing_interval=1.0
)

# Start transcription
transcriber.start_streaming()

# The transcribed text will be printed to console
# Press Ctrl+C to stop
```

### With Custom Callback (LLM Integration)

```python
from whisperpipe import pipeStream

def llm_processor(text):
    """Custom function to process transcribed text"""
    print(f"Processing: {text}")
    # Your LLM integration here, e.g. OpenAI, Claude, a local model, etc.
    # (your_llm_api is a placeholder for your own client)
    response = your_llm_api.chat(text)
    print(f"Response: {response}")
    return response

# Create transcriber with callback
transcriber = pipeStream(
    model_name="base",
    language="en",
    finalization_delay=10.0,
    processing_interval=1.0
)

# Register callback
transcriber.set_def_callback(llm_processor)

# Start streaming with LLM integration
transcriber.start_streaming()
```

### Interactive Mode with Pause/Resume

```python
from whisperpipe import pipeStream

def interactive_processor(text):
    """Process text and pause for response"""
    # Pause transcriber while processing
    transcriber.pause_streaming()
    
    print(f"User said: {text}")
    
    # Process with your system (process_with_llm is a placeholder for your own LLM call)
    response = process_with_llm(text)
    
    # Speak or display response
    print(f"Assistant: {response}")
    
    # Resume for next input
    transcriber.resume_streaming()

transcriber = pipeStream()
transcriber.set_def_callback(interactive_processor)
transcriber.start_streaming()
```

## API Reference

### Constructor Parameters

- `model_name` (str): Whisper model name ("tiny", "base", "small", "medium", "large"). Default: "base"
- `language` (str): Language code for transcription ("en", "es", "fr", etc.). Default: "en"
- `finalization_delay` (float): Wait time in seconds before finalizing transcription. Default: 10.0
- `processing_interval` (float): Interval in seconds between processing cycles. Default: 1.0
- `buffer_duration_seconds` (float): Time window in seconds to hold audio for processing. Default: 5.0
- `debug_mode` (bool): Enable debug mode for detailed logging. Default: True
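
The defaults above can be collected in one place for reference. This is a sketch: the parameter names are copied from the list above, and the commented-out call assumes whisperpipe is installed and a microphone is available.

```python
# Documented defaults, gathered for reference
defaults = dict(
    model_name="base",            # Whisper model size
    language="en",                # transcription language code
    finalization_delay=10.0,      # seconds before finalizing a segment
    processing_interval=1.0,      # seconds between processing cycles
    buffer_duration_seconds=5.0,  # audio window held for processing
    debug_mode=True,              # detailed logging
)

# Equivalent to pipeStream() with no arguments:
# from whisperpipe import pipeStream
# transcriber = pipeStream(**defaults)
```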

### Methods

#### Core Methods
- `start_streaming()`: Start audio capture and transcription
- `stop_streaming()`: Stop audio capture and transcription

#### Callback System
- `set_def_callback(callback_function)`: Register a callback function for processing transcribed text
- `set_def_callback(None)`: Clear the callback (reverts to the default behavior of printing to console)

#### Pause/Resume Control
- `pause_streaming()`: Pause audio processing temporarily
- `resume_streaming()`: Resume audio processing
- `is_paused()`: Check if transcriber is paused
- `is_running()`: Check if transcriber is running
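
The lifecycle and pause/resume methods can be exercised as below. Because the real `pipeStream` needs a microphone, this sketch uses a minimal stub with the same method names to illustrate the expected state transitions; the stub is not part of whisperpipe.

```python
import threading

class StubStream:
    """Minimal stand-in with the same method names as pipeStream,
    for illustrating the documented state transitions only."""
    def __init__(self):
        self._lock = threading.Lock()
        self._running = False
        self._paused = False

    def start_streaming(self):
        with self._lock:
            self._running, self._paused = True, False

    def stop_streaming(self):
        with self._lock:
            self._running = False

    def pause_streaming(self):
        with self._lock:
            self._paused = True

    def resume_streaming(self):
        with self._lock:
            self._paused = False

    def is_running(self):
        return self._running

    def is_paused(self):
        return self._paused

stream = StubStream()
stream.start_streaming()
stream.pause_streaming()   # temporarily stop processing audio
assert stream.is_running() and stream.is_paused()
stream.resume_streaming()  # accept input again
stream.stop_streaming()
```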

## Requirements

- Python >=3.9, <3.13
- PyAudio
- OpenAI Whisper
- PyTorch
- NumPy
- pynput

## License

MIT License

## Author

Erfan Ramezani - erfanramezany245@gmail.com

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Support

For issues and questions, please use the [GitHub Issues](https://github.com/Erfan-ram/whisperpipe/issues) page.

            
