# HoldTranscribe
Hotkey-Activated Voice-to-Clipboard Transcriber
A lightweight tool that records audio while you hold a configurable hotkey, transcribes speech using OpenAI's Whisper model, and copies the result to your clipboard.
---
## Features
* Hold-to-record using a customizable hotkey combination
* GPU acceleration with automatic CUDA detection and CPU fallback
* Instant copy of transcribed text to the clipboard
* Persistent model instance for low-latency transcription
* Configurable model size and beam search settings
* Detailed debug output and performance metrics
* Cross-platform support (Linux, macOS, Windows)
* Voice Activity Detection (VAD) for clean audio capture
* Auto-start service integration for all platforms
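
The VAD step can be pictured with a short sketch. This is an illustration under stated assumptions, not the project's actual code: it assumes 16 kHz, 16-bit mono PCM and 30 ms frames (one of the frame lengths `webrtcvad` accepts), and `split_frames` and `keep_speech` are hypothetical helper names. The speech classifier is passed in as a callable so the example runs without a microphone or the `webrtcvad` package.

```python
from typing import Callable, Iterator

SAMPLE_RATE = 16000    # Hz; assumed capture rate
FRAME_MS = 30          # webrtcvad accepts 10, 20 or 30 ms frames
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM


def split_frames(pcm: bytes, sample_rate: int = SAMPLE_RATE,
                 frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield fixed-size frames; a trailing partial frame is dropped."""
    frame_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


def keep_speech(pcm: bytes, is_speech: Callable[[bytes], bool]) -> bytes:
    """Concatenate only the frames the classifier marks as speech."""
    return b"".join(frame for frame in split_frames(pcm) if is_speech(frame))


# With webrtcvad installed, the classifier could be wired up like this:
#   import webrtcvad
#   vad = webrtcvad.Vad(2)  # aggressiveness 0 (least) to 3 (most)
#   speech = keep_speech(pcm, lambda f: vad.is_speech(f, SAMPLE_RATE))
```

Injecting the classifier keeps the framing logic testable on its own; the aggressiveness value `2` is only an example.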
---
## Platform-Specific Requirements
### Linux
* Python 3.8 or later
* Bash-compatible shell (for installer script)
* A CUDA-capable GPU (optional, for hardware acceleration)
* PulseAudio or equivalent audio system
* Permissions to read input events (user in `input` group)
* X11 or Wayland desktop environment
### macOS
* Python 3.8 or later
* macOS 10.14 (Mojave) or later
* Microphone access permissions
* Accessibility permissions for global hotkey monitoring
* Optional: CUDA-capable GPU (rarely applicable: NVIDIA ended CUDA support for macOS after CUDA 10.2, so newer Macs cannot use it)
### Windows
* Python 3.8 or later
* Windows 10 or later (Windows 11 recommended)
* Microphone access permissions
* Optional: CUDA-capable GPU with appropriate drivers
* PowerShell 5.0 or later (for service installation)
---
## Installation
### Option 1: Pip Installation (Recommended)
**From GitHub (all platforms):**
```bash
pip install git+https://github.com/binaryninja/holdtranscribe.git
```
**From PyPI (when available):**
```bash
pip install holdtranscribe
```
### Option 2: Manual Installation
1. **Clone the repository:**
```bash
git clone https://github.com/binaryninja/holdtranscribe.git
cd holdtranscribe
```
2. **Install Python dependencies:**
```bash
pip install faster-whisper sounddevice pynput webrtcvad pyperclip notify2 numpy psutil
```
3. **Optional GPU acceleration:**
**Linux/Windows with CUDA:**
```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
**macOS with Metal Performance Shaders:**
```bash
pip install torch torchvision torchaudio
```
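
Once PyTorch is installed, CUDA detection with CPU fallback might look like the sketch below. `pick_device` is a hypothetical helper, not a function from this project; the only library call used is `torch.cuda.is_available()`, and a missing PyTorch install is treated as "no GPU".

```python
from typing import Optional


def pick_device(requested: Optional[str] = None) -> str:
    """Return "cuda" or "cpu", honouring an explicit request if one is given."""
    if requested in ("cpu", "cuda"):
        return requested
    try:
        import torch  # optional dependency; absent on CPU-only installs
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())       # auto-detect
print(pick_device("cpu"))  # forced CPU mode
```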
---
## Platform-Specific Setup
### Linux Setup
1. **Add user to input group (if needed):**
```bash
sudo usermod -aG input $USER
```
Log out and back in for changes to take effect.
2. **Install system dependencies (Ubuntu/Debian):**
```bash
sudo apt update
sudo apt install python3-pip portaudio19-dev pulseaudio
```
3. **Install system dependencies (Fedora/RHEL):**
```bash
sudo dnf install python3-pip portaudio-devel pulseaudio
```
### macOS Setup
1. **Install dependencies via Homebrew:**
```bash
# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install PortAudio
brew install portaudio
```
2. **Grant permissions** (on macOS 13 Ventura and later, the same panes are under System Settings → Privacy & Security):
- **Microphone Access:** System Preferences → Security & Privacy → Privacy → Microphone → Enable for Terminal/your Python environment
- **Accessibility Access:** System Preferences → Security & Privacy → Privacy → Accessibility → Enable for Terminal/your Python environment
- **Input Monitoring:** System Preferences → Security & Privacy → Privacy → Input Monitoring → Enable for Terminal/your Python environment
3. **For Apple Silicon Macs:**
```bash
# Install Python dependencies with conda for better compatibility
conda install python=3.9
pip install faster-whisper sounddevice pynput webrtcvad pyperclip notify2 numpy psutil
```
### Windows Setup
1. **Install Python via Microsoft Store or python.org:**
- Download Python from [python.org](https://python.org) or install via Microsoft Store
- Ensure "Add Python to PATH" is checked during installation
2. **Install Visual C++ Build Tools (if compilation errors occur):**
- Download and install Microsoft C++ Build Tools
- Or install Visual Studio Community with C++ workload
3. **Grant microphone permissions:**
- Settings → Privacy → Microphone → Allow apps to access microphone → Enable for Python/Terminal
---
## Usage
### Basic Usage (All Platforms)
```bash
# Run with default settings (if installed via pip)
holdtranscribe
# Or if using the script directly
python voice_hold_to_clip.py
```
### Command Line Options
```bash
--model <size>        Whisper model size (tiny, base, small, medium, large-v3). Default: large-v3
--beam-size <n>       Beam search width (1 is fastest). Default: 5
--fast                Shorthand for `--model base --beam-size 1`
--debug               Enable verbose timing and resource metrics
--device <cpu|cuda>   Force CPU or GPU mode
```
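
For readers who want to see how these options fit together, here is a minimal `argparse` sketch. It mirrors the flags and defaults listed above but is not the project's actual parser; `build_parser` and `parse_options` are hypothetical names, and letting `--fast` override `--model` and `--beam-size` is an assumption about precedence.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="holdtranscribe")
    parser.add_argument("--model", default="large-v3",
                        help="Whisper model size")
    parser.add_argument("--beam-size", type=int, default=5,
                        help="Beam search width (1 is fastest)")
    parser.add_argument("--fast", action="store_true",
                        help="Shorthand for --model base --beam-size 1")
    parser.add_argument("--debug", action="store_true",
                        help="Enable verbose timing and resource metrics")
    parser.add_argument("--device", choices=["cpu", "cuda"], default=None,
                        help="Force CPU or GPU mode (default: auto-detect)")
    return parser


def parse_options(argv=None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    if args.fast:
        # Assumption: --fast wins over explicit --model/--beam-size values.
        args.model, args.beam_size = "base", 1
    return args
```

Calling `parse_options(["--fast"])` yields `model="base"` and `beam_size=1` in this sketch.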
### Platform-Specific Examples
**Linux/macOS:**
```bash
holdtranscribe --model tiny --beam-size 1
```
**Windows (Command Prompt):**
```cmd
holdtranscribe --model tiny --beam-size 1
```
**Windows (PowerShell):**
```powershell
holdtranscribe --model tiny --beam-size 1
```
---
## Auto-Start Service Setup
### Linux (systemd)
1. **Create service directory:**
```bash
mkdir -p ~/.config/systemd/user
```
2. **Create service file:**
```bash
cat > ~/.config/systemd/user/holdtranscribe.service << 'EOF'
[Unit]
Description=HoldTranscribe Voice Transcriber
After=graphical-session.target
[Service]
Type=simple
ExecStart=/usr/bin/holdtranscribe --model large-v3 --beam-size 1
Restart=always
RestartSec=5
Environment=DISPLAY=:0
Environment=XDG_RUNTIME_DIR=/run/user/%U
WorkingDirectory=%h
[Install]
WantedBy=default.target
EOF
```
3. **Enable and start:**
```bash
systemctl --user daemon-reload
systemctl --user enable holdtranscribe.service
systemctl --user start holdtranscribe.service
```
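
The `ExecStart` path in the unit file depends on where pip placed the `holdtranscribe` executable (for example `~/.local/bin` for a user install). The sketch below shows one way to render a unit with the path resolved by `shutil.which`. `render_unit` and `UNIT_TEMPLATE` are hypothetical names, and the template is a trimmed version of the unit above without the `Environment` lines.

```python
import shutil

UNIT_TEMPLATE = """\
[Unit]
Description=HoldTranscribe Voice Transcriber
After=graphical-session.target

[Service]
Type=simple
ExecStart={exe} --model {model} --beam-size {beam}
Restart=always
RestartSec=5

[Install]
WantedBy=default.target
"""


def render_unit(exe=None, model="large-v3", beam=1) -> str:
    """Return unit-file text, looking the executable up on PATH if not given."""
    exe = exe or shutil.which("holdtranscribe")
    if exe is None:
        raise FileNotFoundError("holdtranscribe is not on PATH")
    return UNIT_TEMPLATE.format(exe=exe, model=model, beam=beam)


# Example with an explicit (illustrative) path:
print(render_unit("/home/me/.local/bin/holdtranscribe"))
```

The printed text can be saved as `~/.config/systemd/user/holdtranscribe.service` before running the `systemctl --user` commands.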
### macOS (launchd)
1. **Create launch agent directory:**
```bash
mkdir -p ~/Library/LaunchAgents
```
2. **Create plist file:**
```bash
cat > ~/Library/LaunchAgents/com.holdtranscribe.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.holdtranscribe</string>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/holdtranscribe</string>
<string>--model</string>
<string>large-v3</string>
<string>--beam-size</string>
<string>1</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
</dict>
</plist>
EOF
```
3. **Load the service:**
```bash
launchctl load ~/Library/LaunchAgents/com.holdtranscribe.plist
launchctl start com.holdtranscribe
```
### Windows (Task Scheduler)
1. **Create batch file for easier management:**
```batch
@echo off
holdtranscribe --model large-v3 --beam-size 1
```
Save as `holdtranscribe.bat`
2. **Using Task Scheduler GUI:**
- Open Task Scheduler (taskschd.msc)
- Create Basic Task → Name: "HoldTranscribe"
- Trigger: When I log on
- Action: Start a program → Browse to your batch file
- Finish and test
3. **Using PowerShell (run as Administrator):**
```powershell
$action = New-ScheduledTaskAction -Execute "C:\path\to\holdtranscribe.bat"
$trigger = New-ScheduledTaskTrigger -AtLogon
$settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries
Register-ScheduledTask -TaskName "HoldTranscribe" -Action $action -Trigger $trigger -Settings $settings
```
---
## Configuration
### Hotkey Customization
Edit the `HOTKEY` set in the script to change key combinations:
```python
# Default: Ctrl + Mouse Forward Button
HOTKEY = {keyboard.Key.ctrl, mouse.Button.button9}
# Alternative examples:
# HOTKEY = {keyboard.Key.ctrl, keyboard.Key.space} # Ctrl + Space
# HOTKEY = {keyboard.Key.alt, mouse.Button.left} # Alt + Left Click
# HOTKEY = {mouse.Button.button8} # Mouse Back Button only
```
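
To illustrate how a `HOTKEY` set like the one above can drive hold-to-record, the sketch below tracks which inputs are currently held and reports when the combination becomes complete or is broken. `HotkeyTracker` is a hypothetical class, not taken from the script, and it uses plain strings in place of `pynput` key objects so that it runs without a display.

```python
class HotkeyTracker:
    """Track held inputs and detect when a hotkey combination starts or ends."""

    def __init__(self, hotkey):
        self.hotkey = frozenset(hotkey)
        self.held = set()
        self.active = False

    def press(self, key) -> bool:
        """Record a key-down; return True if this press completes the combo."""
        self.held.add(key)
        if not self.active and self.hotkey <= self.held:
            self.active = True
            return True   # caller would start recording here
        return False

    def release(self, key) -> bool:
        """Record a key-up; return True if this release ends the hold."""
        self.held.discard(key)
        if self.active and not (self.hotkey <= self.held):
            self.active = False
            return True   # caller would stop recording and transcribe
        return False


tracker = HotkeyTracker({"ctrl", "button9"})
tracker.press("ctrl")
print(tracker.press("button9"))   # combo complete
print(tracker.release("ctrl"))    # hold ended
```

With `pynput`, `press` and `release` would be called from the `on_press`/`on_release` callbacks of `keyboard.Listener` and from the `on_click(x, y, button, pressed)` callback of `mouse.Listener`.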
### Platform-Specific Mouse Button Notes
- **Windows:** Button numbers may vary by mouse driver; pynput's Windows backend names the side buttons `x1` and `x2` rather than `button8`/`button9`
- **macOS:** Some mouse buttons may require additional permissions
- **Linux:** Button numbers can be checked with `xev` command
### Environment Variables
- `CUDA_VISIBLE_DEVICES` - Control GPU usage
- `TRANSFORMERS_CACHE` - Customize model cache location
- `DISABLE_NOTIFY=1` - Suppress desktop notifications
- `PULSE_SERVER` (Linux) - Specify PulseAudio server
- `PORTAUDIO_DEVICE` - Force specific audio device
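
As an illustration of how two of these variables could be interpreted, here is a small sketch. The helper names are hypothetical, and the exact checks the tool performs are an assumption; the treatment of an empty `CUDA_VISIBLE_DEVICES` as "no GPUs visible" follows the usual CUDA convention.

```python
import os


def notifications_enabled(env=None) -> bool:
    """Desktop notifications stay on unless DISABLE_NOTIFY is exactly "1"."""
    env = os.environ if env is None else env
    return env.get("DISABLE_NOTIFY", "") != "1"


def gpu_hidden(env=None) -> bool:
    """True when CUDA_VISIBLE_DEVICES is set so that no GPU is exposed."""
    env = os.environ if env is None else env
    return env.get("CUDA_VISIBLE_DEVICES") in ("", "-1")


print(notifications_enabled({"DISABLE_NOTIFY": "1"}))
print(gpu_hidden({"CUDA_VISIBLE_DEVICES": ""}))
```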
---
## Monitoring and Logs
### Linux (systemd)
```bash
# View logs
journalctl --user -u holdtranscribe.service -f
# Check status
systemctl --user status holdtranscribe.service
```
### macOS (launchd)
```bash
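# View logs (only if the plist sets StandardOutPath/StandardErrorPath to this file;
# the example plist in this README does not define a log path)
tail -f ~/Library/Logs/com.holdtranscribe.log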
# Check status
launchctl list | grep holdtranscribe
```
### Windows (Task Scheduler)
- Task Scheduler → Task Scheduler Library → HoldTranscribe → History tab
- Or check Windows Event Viewer → Applications and Services Logs
---
## Troubleshooting
### Common Issues (All Platforms)
**Model loading errors:**
```bash
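# Clear the model cache and retry. faster-whisper normally downloads models via
# huggingface_hub into ~/.cache/huggingface/hub, so check that directory as well.
rm -rf ~/.cache/huggingface/transformers/
holdtranscribe --model tiny  # Start with a smaller model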
```
**Audio device issues:**
```bash
# List available devices
python -c "import sounddevice as sd; print(sd.query_devices())"
```
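
The device list usually mixes inputs and outputs. A sketch of filtering it down to devices that can record is shown below. `input_devices` is a hypothetical helper; it relies only on the `name` and `max_input_channels` fields that `sounddevice.query_devices()` reports for each device, and it is demonstrated here on hand-written sample data so that it runs without audio hardware.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices with at least one input channel."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev.get("max_input_channels", 0) > 0
    ]


# Sample data in the same shape as sounddevice's device dictionaries.
sample = [
    {"name": "USB Microphone", "max_input_channels": 1},
    {"name": "HDMI Output", "max_input_channels": 0},
]
print(input_devices(sample))

# With real hardware:
#   import sounddevice as sd
#   print(input_devices(sd.query_devices()))
```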
### Linux-Specific Issues
**Permission denied on input events:**
```bash
sudo usermod -aG input $USER
# Log out and back in
```
**Audio issues with PulseAudio:**
```bash
# Restart PulseAudio
pulseaudio -k
pulseaudio --start
```
**X11 forwarding issues:**
```bash
export DISPLAY=:0
xhost +local:
```
### macOS-Specific Issues
**Accessibility permissions denied:**
- System Preferences → Security & Privacy → Privacy → Accessibility
- Add Terminal or your Python executable
- May need to remove and re-add if issues persist
**Microphone access denied:**
- System Preferences → Security & Privacy → Privacy → Microphone
- Enable for Terminal/Python
**"Operation not permitted" errors:**
```bash
# Try running with sudo temporarily to identify permission issue
sudo holdtranscribe --debug
```
**Python/PortAudio conflicts:**
```bash
# Reinstall with Homebrew
brew uninstall portaudio
brew install portaudio
pip uninstall sounddevice
pip install sounddevice
```
### Windows-Specific Issues
**DLL load failures:**
```cmd
REM Install the Visual C++ Redistributable
REM (download it from the Microsoft website)
```
**Microphone access denied:**
- Settings → Privacy → Microphone → Allow apps to access microphone
- Ensure Python/Terminal is enabled
**CUDA issues:**
```cmd
REM Check CUDA installation
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```
**PowerShell execution policy:**
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
**Antivirus blocking:**
- Add Python executable to antivirus exclusions
- Add HoldTranscribe directory to exclusions
### Performance Optimization
**For slower systems:**
```bash
# Use the fastest settings (omit --fast here: it selects the base model and would conflict with --model tiny)
holdtranscribe --model tiny --beam-size 1
```
**For better accuracy:**
```bash
# Use larger model with more processing
holdtranscribe --model large-v3 --beam-size 5
```
**Memory management:**
```bash
# Monitor memory usage
holdtranscribe --debug
```
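
The effect of model size, beam size, and a persistent model instance can be sketched with the `faster-whisper` API. This is a hedged illustration, not the project's code: `get_model` and `transcribe` are hypothetical helpers, and the `compute_type` choices (`float16` on GPU, `int8` on CPU) are assumptions. The model is cached so that it is loaded only once per process.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model(size: str = "large-v3", device: str = "cpu"):
    """Load the model once and reuse it for every later transcription."""
    from faster_whisper import WhisperModel  # imported lazily; heavy dependency
    compute_type = "float16" if device == "cuda" else "int8"
    return WhisperModel(size, device=device, compute_type=compute_type)


def transcribe(audio, beam_size: int = 5, model=None) -> str:
    """Transcribe audio (file path or float32 array) and join segment texts."""
    if model is None:
        model = get_model()
    segments, _info = model.transcribe(audio, beam_size=beam_size)
    return " ".join(segment.text.strip() for segment in segments)
```

A smaller `size` and `beam_size=1` trade accuracy for speed and memory, which corresponds to the command-line examples above.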
---
## Contributing
Contributions, issues, and feature requests are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Test on multiple platforms when possible
4. Submit a pull request
When reporting issues, please include:
- Operating system and version
- Python version
- Full error message
- Steps to reproduce
---
## License
This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.
---
## Acknowledgments
- OpenAI Whisper team for the excellent speech recognition model
- Contributors to the faster-whisper implementation
- All the open-source libraries that make this project possible
Raw data
{
"_id": null,
"home_page": "https://github.com/binaryninja/holdtranscribe",
"name": "holdtranscribe",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "voice, transcription, whisper, hotkey, clipboard, speech-to-text",
"author": "binaryninja",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/25/35/5f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982/holdtranscribe-1.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Hotkey-Activated Voice-to-Clipboard Transcriber",
"version": "1.0.1",
"project_urls": {
"Documentation": "https://github.com/binaryninja/holdtranscribe#readme",
"Homepage": "https://github.com/binaryninja/holdtranscribe",
"Issues": "https://github.com/binaryninja/holdtranscribe/issues",
"Repository": "https://github.com/binaryninja/holdtranscribe"
},
"split_keywords": [
"voice",
" transcription",
" whisper",
" hotkey",
" clipboard",
" speech-to-text"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "f83a396824b62b27eb02c0e1f6acf16ddf7e9b213cf13be1f87cc9790b81e9c1",
"md5": "ea35c8bd833e2098790998d5c3628bfa",
"sha256": "af6c5b22a3af62efbfccd4ae28a8c12df774d593efda2d2581321590b0f9e6e2"
},
"downloads": -1,
"filename": "holdtranscribe-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ea35c8bd833e2098790998d5c3628bfa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 13839,
"upload_time": "2025-07-18T17:12:46",
"upload_time_iso_8601": "2025-07-18T17:12:46.517658Z",
"url": "https://files.pythonhosted.org/packages/f8/3a/396824b62b27eb02c0e1f6acf16ddf7e9b213cf13be1f87cc9790b81e9c1/holdtranscribe-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "25355f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982",
"md5": "2fe9738cbfb9e690a8505b72379c0d4a",
"sha256": "5b1c63f5a9480c93a4648b344ede67fb5342b0e4c5c361a2431bd1b27e1ff865"
},
"downloads": -1,
"filename": "holdtranscribe-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "2fe9738cbfb9e690a8505b72379c0d4a",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 25989,
"upload_time": "2025-07-18T17:12:47",
"upload_time_iso_8601": "2025-07-18T17:12:47.675083Z",
"url": "https://files.pythonhosted.org/packages/25/35/5f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982/holdtranscribe-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-18 17:12:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "binaryninja",
"github_project": "holdtranscribe",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "faster-whisper",
"specs": [
[
">=",
"0.9.0"
]
]
},
{
"name": "sounddevice",
"specs": [
[
">=",
"0.4.6"
]
]
},
{
"name": "pynput",
"specs": [
[
">=",
"1.7.6"
]
]
},
{
"name": "webrtcvad",
"specs": [
[
">=",
"2.0.10"
]
]
},
{
"name": "pyperclip",
"specs": [
[
">=",
"1.8.2"
]
]
},
{
"name": "notify2",
"specs": [
[
">=",
"0.3.1"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.21.0"
]
]
},
{
"name": "psutil",
"specs": [
[
">=",
"5.9.0"
]
]
}
],
"lcname": "holdtranscribe"
}