# Whisper Chain
<p align="center">
<img src="https://github.com/chrischoy/WhisperChain/raw/main/assets/logo.jpg" width="30%" alt="Whisper Chain Logo" />
</p>
## Overview
Typing is boring, so let's use voice to speed up your workflow. This project combines:
- Real-time speech recognition using Whisper.cpp
- Transcription cleanup using LangChain
- Global hotkey support for voice control
- Automatic clipboard integration for the cleaned transcription
## Requirements
- Python 3.8+
- OpenAI API Key
- For macOS:
  - ffmpeg (for audio processing)
  - portaudio (for audio capture)
## Installation
1. Install system dependencies (macOS):
```bash
# Install ffmpeg and portaudio using Homebrew
brew install ffmpeg portaudio
```
2. Install the project:
```bash
pip install whisperchain
```
## Configuration
WhisperChain will look for configuration in the following locations:
1. Environment variables
2. .env file in the current directory
3. ~/.whisperchain/.env file
On first run, if no configuration is found, you will be prompted to enter your OpenAI API key. The key will be saved in `~/.whisperchain/.env` for future use.
You can also manually set your OpenAI API key in any of these ways:
```bash
# Option 1: Environment variable
export OPENAI_API_KEY=your-api-key-here
# Option 2: Create .env file in current directory
echo "OPENAI_API_KEY=your-api-key-here" > .env
# Option 3: Create global config
mkdir -p ~/.whisperchain
echo "OPENAI_API_KEY=your-api-key-here" > ~/.whisperchain/.env
```
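The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not WhisperChain's actual code: the function names `read_env_file` and `find_api_key` are hypothetical, and it assumes the three locations are checked in the order they are listed, with the first match winning.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_api_key(environ: Mapping[str, str] = os.environ,
                 cwd: Optional[Path] = None,
                 home: Optional[Path] = None) -> Optional[str]:
    """Return the first OPENAI_API_KEY found (assumed order: env, ./.env, ~/.whisperchain/.env)."""
    cwd = cwd or Path.cwd()
    home = home or Path.home()
    if environ.get("OPENAI_API_KEY"):
        return environ["OPENAI_API_KEY"]
    for env_file in (cwd / ".env", home / ".whisperchain" / ".env"):
        key = read_env_file(env_file).get("OPENAI_API_KEY")
        if key:
            return key
    return None  # nothing configured; this is where the first-run prompt would appear
```

Calling `find_api_key()` with no arguments checks the real environment and home directory, which is a quick way to see which source would supply the key under these assumptions.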
## Usage
1. Start the application:
```bash
# Run with default settings
whisperchain
# Run with custom configuration
whisperchain --config config.json
# Override specific settings
whisperchain --port 8080 --hotkey "<ctrl>+<alt>+t" --model "large" --debug
```
2. Use the global hotkey (`<ctrl>+<alt>+r` by default; `<ctrl>+<option>+r` on macOS):
   - Press and hold to start recording
   - Speak your text
   - Release to stop recording
   - The cleaned transcription is copied to your clipboard automatically
   - Press Ctrl+V (Cmd+V on macOS) to paste the transcription
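The press-and-hold behavior can be modeled as a small state machine. The sketch below is illustrative only: `parse_hotkey` and `HoldToRecord` are hypothetical names, keys are plain strings rather than events from a real keyboard listener, and it assumes recording runs exactly while every key in the hotkey is held.

```python
from typing import FrozenSet, List, Set


def parse_hotkey(spec: str) -> FrozenSet[str]:
    """Turn '<ctrl>+<alt>+r' into frozenset({'ctrl', 'alt', 'r'})."""
    return frozenset(part.strip().strip("<>").lower() for part in spec.split("+"))


class HoldToRecord:
    """Tracks pressed keys and records while the whole hotkey is held down."""

    def __init__(self, hotkey: str) -> None:
        self.combo = parse_hotkey(hotkey)
        self.pressed: Set[str] = set()
        self.recording = False
        self.events: List[str] = []  # log of "start"/"stop" transitions

    def on_press(self, key: str) -> None:
        self.pressed.add(key.lower())
        # Start once every key of the combo is down.
        if not self.recording and self.combo <= self.pressed:
            self.recording = True
            self.events.append("start")

    def on_release(self, key: str) -> None:
        self.pressed.discard(key.lower())
        # Stop as soon as any key of the combo is released.
        if self.recording and not self.combo <= self.pressed:
            self.recording = False
            self.events.append("stop")


handler = HoldToRecord("<ctrl>+<alt>+r")
for key in ("ctrl", "alt", "r"):
    handler.on_press(key)
handler.on_release("r")
print(handler.events)  # ['start', 'stop']
```

In a real client, the "start" and "stop" transitions would open and close the audio stream instead of appending to a list.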
## Development
### Streamlit UI
```bash
streamlit run src/whisperchain/ui/streamlit_app.py
```
If the Streamlit UI fails to start or stops responding, you can free its port by killing any process that is listening on port 8501:
```bash
lsof -ti :8501 | xargs kill -9
```
### Running Tests
Install test dependencies:
```bash
pip install -e ".[test]"
```
Run tests:
```bash
pytest tests/
```
Run tests with microphone input:
```bash
# Run specific microphone test
TEST_WITH_MIC=1 pytest tests/test_stream_client.py -v -k test_stream_client_with_real_mic
# Run all tests including microphone test
TEST_WITH_MIC=1 pytest tests/
```
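One way a test can be gated on `TEST_WITH_MIC` is sketched below. This is an assumption about the mechanism, not the project's actual test code: the helper `mic_tests_enabled` is hypothetical, and the sketch uses the standard library's `unittest` (which pytest also collects) so that it runs without extra dependencies.

```python
import os
import unittest


def mic_tests_enabled(environ=os.environ) -> bool:
    """True when the TEST_WITH_MIC flag is set to 1."""
    return environ.get("TEST_WITH_MIC") == "1"


class TestStreamClient(unittest.TestCase):
    @unittest.skipUnless(mic_tests_enabled(), "set TEST_WITH_MIC=1 to run")
    def test_stream_client_with_real_mic(self):
        # Placeholder body: a real test would open the microphone here.
        self.assertTrue(mic_tests_enabled())
```

With this pattern, a plain `pytest tests/` run reports the microphone test as skipped rather than failing on machines without audio hardware.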
### Building the project
```bash
python -m build
pip install .
```
### Publishing to PyPI
```bash
python -m build
twine upload --repository pypi dist/*
```
## License
[LICENSE](LICENSE)
## Acknowledgments
- [Whisper.cpp](https://github.com/ggerganov/whisper.cpp)
- [pywhispercpp](https://github.com/absadiki/pywhispercpp.git)
- [LangChain](https://github.com/langchain-ai/langchain)
## Architecture
```mermaid
graph TB
    subgraph "Client Options"
        K[Key Listener]
        A[Audio Stream]
        C[Clipboard]
    end

    subgraph "Streamlit Web UI :8501"
        WebP[Prompt]
        WebH[History]
    end

    subgraph "FastAPI Server :8000"
        WS[WebSocket /stream]
        W[Whisper Model]
        LC[LangChain Processor]
        H[History]
    end

    K -->|"Hot Key"| A
    A -->|"Audio Stream"| WS
    WS --> W
    W --> LC
    WebP --> LC
    LC --> C
    LC --> H
    H --> WebH
```
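The data flow in the diagram can be read as a short pipeline. The sketch below uses stand-in callables for the Whisper model, the LangChain processor, and the clipboard; the `Pipeline` class, its field names, and the default prompt are hypothetical and only mirror the arrows in the diagram, not the project's real classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    transcribe: Callable[[bytes], str]        # stand-in for the Whisper model (W)
    cleanup: Callable[[str, str], str]        # stand-in for the LangChain processor (LC): (prompt, text)
    copy_to_clipboard: Callable[[str], None]  # stand-in for the clipboard (C)
    prompt: str = "Clean up this transcription."  # would come from the web UI prompt (WebP)
    history: List[str] = field(default_factory=list)  # history (H), shown in the web UI

    def handle_audio(self, audio: bytes) -> str:
        raw = self.transcribe(audio)              # WS --> W
        cleaned = self.cleanup(self.prompt, raw)  # W --> LC, WebP --> LC
        self.copy_to_clipboard(cleaned)           # LC --> C
        self.history.append(cleaned)              # LC --> H
        return cleaned


clipboard: List[str] = []
pipeline = Pipeline(
    transcribe=lambda audio: "um hello world",
    cleanup=lambda prompt, text: text.replace("um ", "").capitalize(),
    copy_to_clipboard=clipboard.append,
)
result = pipeline.handle_audio(b"\x00\x01")
print(result)  # Hello world
```

Passing the stages in as callables keeps the sketch runnable without a model or an API key; in the real application these roles are filled by the server-side components named in the diagram.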
Raw data
{
"_id": null,
"home_page": null,
"name": "whisperchain",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "whisper, langchain, voice-control, speech-to-text",
"author": null,
"author_email": "Chris Choy <chrischoy@ai.stanford.edu>",
"download_url": "https://files.pythonhosted.org/packages/b1/aa/091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89/whisperchain-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Voice control using Whisper.cpp with LangChain cleanup",
"version": "0.1.3",
"project_urls": {
"Bug Tracker": "https://github.com/chrischoy/whisperchain/issues",
"Homepage": "https://github.com/chrischoy/whisperchain"
},
"split_keywords": [
"whisper",
" langchain",
" voice-control",
" speech-to-text"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "91bf0eac79291464b354cb23548026ae05d2becff917c1262b27931cece24cd1",
"md5": "90310768c632e7b3332185b15d47506d",
"sha256": "91e607ea8cdf2143c9552acb322a8fc7b5cb3ab78f2c61bbc6d37653bff68e2d"
},
"downloads": -1,
"filename": "whisperchain-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "90310768c632e7b3332185b15d47506d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 19386,
"upload_time": "2025-02-09T20:40:52",
"upload_time_iso_8601": "2025-02-09T20:40:52.653897Z",
"url": "https://files.pythonhosted.org/packages/91/bf/0eac79291464b354cb23548026ae05d2becff917c1262b27931cece24cd1/whisperchain-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b1aa091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89",
"md5": "5cf31a0757ec3af6466a3664a23b730d",
"sha256": "088ec97c71bbe93b9826efe2781d65754a5ef3932691e88526ba2c311f31048e"
},
"downloads": -1,
"filename": "whisperchain-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "5cf31a0757ec3af6466a3664a23b730d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 22128,
"upload_time": "2025-02-09T20:40:54",
"upload_time_iso_8601": "2025-02-09T20:40:54.429102Z",
"url": "https://files.pythonhosted.org/packages/b1/aa/091fefc96ab72566824ebef37067ae462f219f78104965a97a9379c90d89/whisperchain-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-09 20:40:54",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "chrischoy",
"github_project": "whisperchain",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "whisperchain"
}