whisperchain


| Field | Value |
| --- | --- |
| Name | whisperchain |
| Version | 0.1.3 |
| Summary | Voice control using Whisper.cpp with LangChain cleanup |
| Upload time | 2025-02-09 20:40:54 |
| Requires Python | >=3.8 |
| License | MIT |
| Keywords | whisper, langchain, voice-control, speech-to-text |

# Whisper Chain

<p align="center">
  <img src="https://github.com/chrischoy/WhisperChain/raw/main/assets/logo.jpg" width="30%" alt="Whisper Chain Logo" />
</p>

## Overview

Typing is boring, so let's use voice to speed up your workflow. This project combines:
- Real-time speech recognition using Whisper.cpp
- Transcription cleanup using LangChain
- Global hotkey support for voice control
- Automatic clipboard integration for the cleaned transcription

## Requirements

- Python 3.8+
- OpenAI API Key
- For macOS:
  - ffmpeg (for audio processing)
  - portaudio (for audio capture)

## Installation

1. Install system dependencies (macOS):
```bash
# Install ffmpeg and portaudio using Homebrew
brew install ffmpeg portaudio
```

2. Install the project:
```bash
pip install whisperchain
```

## Configuration

WhisperChain looks for configuration in the following locations:
1. Environment variables
2. A `.env` file in the current directory
3. The `~/.whisperchain/.env` file

On first run, if no configuration is found, you will be prompted to enter your OpenAI API key. The key will be saved in `~/.whisperchain/.env` for future use.

You can also manually set your OpenAI API key in any of these ways:
```bash
# Option 1: Environment variable
export OPENAI_API_KEY=your-api-key-here

# Option 2: Create .env file in current directory
echo "OPENAI_API_KEY=your-api-key-here" > .env

# Option 3: Create global config
mkdir -p ~/.whisperchain
echo "OPENAI_API_KEY=your-api-key-here" > ~/.whisperchain/.env
```
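
The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not WhisperChain's actual code: the function names `read_env_file` and `find_api_key` are hypothetical, and the sketch assumes the three locations are tried in the order they are listed, which the text above does not state explicitly.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def find_api_key(
    environ: Optional[Mapping[str, str]] = None,
    env_files: Optional[Sequence[Path]] = None,
) -> Optional[str]:
    """Return the first OPENAI_API_KEY found, or None.

    Assumed order: environment, ./.env, then ~/.whisperchain/.env.
    """
    environ = os.environ if environ is None else environ
    if env_files is None:
        env_files = [Path.cwd() / ".env", Path.home() / ".whisperchain" / ".env"]
    key = environ.get("OPENAI_API_KEY")
    if key:
        return key
    for path in env_files:
        key = read_env_file(path).get("OPENAI_API_KEY")
        if key:
            return key
    return None
```

If `find_api_key()` returns `None`, that corresponds to the first-run case where you are prompted for the key.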

## Usage

1. Start the application:
```bash
# Run with default settings
whisperchain

# Run with custom configuration
whisperchain --config config.json

# Override specific settings
whisperchain --port 8080 --hotkey "<ctrl>+<alt>+t" --model "large" --debug
```

2. Use the global hotkey (`<ctrl>+<alt>+r` by default; `<ctrl>+<option>+r` on macOS):
   - Press and hold to start recording
   - Speak your text
   - Release to stop recording
   - The cleaned transcription is copied to your clipboard automatically
   - Paste the transcription with Ctrl+V (Cmd+V on macOS)
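
The press-and-hold behaviour of the hotkey can be modelled as a small state machine. The sketch below is illustrative only and does not reflect how WhisperChain implements its key listener: the class `HoldToRecord` and its callbacks are hypothetical, and keys are plain strings rather than events from a real keyboard library.

```python
from typing import Callable, Iterable


class HoldToRecord:
    """Start recording when every key in the combo is down; stop on release."""

    def __init__(
        self,
        combo: Iterable[str],
        on_start: Callable[[], None],
        on_stop: Callable[[], None],
    ) -> None:
        self.combo = frozenset(combo)
        self.pressed = set()
        self.recording = False
        self.on_start = on_start
        self.on_stop = on_stop

    def press(self, key: str) -> None:
        self.pressed.add(key)
        # Fire once when the full combo is held; key repeats are ignored.
        if not self.recording and self.combo <= self.pressed:
            self.recording = True
            self.on_start()

    def release(self, key: str) -> None:
        self.pressed.discard(key)
        # Releasing any key of the combo ends the recording.
        if self.recording and key in self.combo:
            self.recording = False
            self.on_stop()
```

In a real client, `on_start` would open the audio stream and `on_stop` would close it and wait for the cleaned transcription.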

## Development

### Streamlit UI

```bash
streamlit run src/whisperchain/ui/streamlit_app.py
```

If the Streamlit UI hits an error, you can run the following command to kill whatever process is listening on Streamlit's port (8501):

```bash
lsof -ti :8501 | xargs kill -9
```

### Running Tests

Install test dependencies:
```bash
pip install -e ".[test]"
```

Run tests:
```bash
pytest tests/
```

Run tests with microphone input:
```bash
# Run specific microphone test
TEST_WITH_MIC=1 pytest tests/test_stream_client.py -v -k test_stream_client_with_real_mic

# Run all tests including microphone test
TEST_WITH_MIC=1 pytest tests/
```

### Building the project

```bash
python -m build
pip install .
```

### Publishing to PyPI

```bash
python -m build
twine upload --repository pypi dist/*
```

## License

MIT. See [LICENSE](LICENSE).

## Acknowledgments

- [Whisper.cpp](https://github.com/ggerganov/whisper.cpp)
- [pywhispercpp](https://github.com/absadiki/pywhispercpp.git)
- [LangChain](https://github.com/langchain-ai/langchain)


## Architecture

```mermaid
graph TB
    subgraph "Client Options"
        K[Key Listener]
        A[Audio Stream]
        C[Clipboard]
    end

    subgraph "Streamlit Web UI :8501"
        WebP[Prompt]
        WebH[History]
    end

    subgraph "FastAPI Server :8000"
        WS[WebSocket /stream]
        W[Whisper Model]
        LC[LangChain Processor]
        H[History]
    end

    K -->|"Hot Key"| A
    A -->|"Audio Stream"| WS
    WS --> W
    W --> LC
    WebP --> LC
    LC --> C
    LC --> H
    H --> WebH
```
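
The data flow in the diagram can be traced with a small sketch. It is a simplified model under stated assumptions, not the project's server code: the `Pipeline` class, its field names, and the default prompt are hypothetical, and the Whisper model, LangChain processor, and clipboard are replaced by plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Pipeline:
    transcribe: Callable[[bytes], str]   # stands in for the Whisper model (W)
    cleanup: Callable[[str, str], str]   # stands in for the LangChain processor (LC): (prompt, text)
    copy: Callable[[str], None]          # stands in for the clipboard (C)
    prompt: str = "Fix grammar and punctuation."  # hypothetical default; editable via the web UI (WebP)
    history: List[Dict[str, str]] = field(default_factory=list)  # H, shown in the web UI (WebH)

    def handle(self, audio: bytes) -> str:
        """Follow one recording through the diagram: WS -> W -> LC -> C and H."""
        raw = self.transcribe(audio)
        cleaned = self.cleanup(self.prompt, raw)
        self.copy(cleaned)
        self.history.append({"raw": raw, "cleaned": cleaned})
        return cleaned
```

Passing the stages in as callables keeps the sketch runnable without a model or an API key; each one can be swapped for a real implementation.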

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "whisperchain",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "whisper, langchain, voice-control, speech-to-text",
    "author": null,
    "author_email": "Chris Choy <chrischoy@ai.stanford.edu>",
    "download_url": "https://files.pythonhosted.org/packages/b1/aa/091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89/whisperchain-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Voice control using Whisper.cpp with LangChain cleanup",
    "version": "0.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/chrischoy/whisperchain/issues",
        "Homepage": "https://github.com/chrischoy/whisperchain"
    },
    "split_keywords": [
        "whisper",
        " langchain",
        " voice-control",
        " speech-to-text"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "91bf0eac79291464b354cb23548026ae05d2becff917c1262b27931cece24cd1",
                "md5": "90310768c632e7b3332185b15d47506d",
                "sha256": "91e607ea8cdf2143c9552acb322a8fc7b5cb3ab78f2c61bbc6d37653bff68e2d"
            },
            "downloads": -1,
            "filename": "whisperchain-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "90310768c632e7b3332185b15d47506d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 19386,
            "upload_time": "2025-02-09T20:40:52",
            "upload_time_iso_8601": "2025-02-09T20:40:52.653897Z",
            "url": "https://files.pythonhosted.org/packages/91/bf/0eac79291464b354cb23548026ae05d2becff917c1262b27931cece24cd1/whisperchain-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b1aa091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89",
                "md5": "5cf31a0757ec3af6466a3664a23b730d",
                "sha256": "088ec97c71bbe93b9826efe2781d65754a5ef3932691e88526ba2c311f31048e"
            },
            "downloads": -1,
            "filename": "whisperchain-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "5cf31a0757ec3af6466a3664a23b730d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 22128,
            "upload_time": "2025-02-09T20:40:54",
            "upload_time_iso_8601": "2025-02-09T20:40:54.429102Z",
            "url": "https://files.pythonhosted.org/packages/b1/aa/091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89/whisperchain-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-09 20:40:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "chrischoy",
    "github_project": "whisperchain",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "whisperchain"
}
        