holdtranscribe

Name	holdtranscribe
Version	1.0.1
home_page	https://github.com/binaryninja/holdtranscribe
Summary	Hotkey-Activated Voice-to-Clipboard Transcriber
upload_time	2025-07-18 17:12:47
maintainer	None
docs_url	None
author	binaryninja
requires_python	>=3.8
license	None
keywords	voice transcription whisper hotkey clipboard speech-to-text
requirements	faster-whisper sounddevice pynput webrtcvad pyperclip notify2 numpy psutil
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# HoldTranscribe

Hotkey-Activated Voice-to-Clipboard Transcriber

A lightweight tool that records audio while you hold a configurable hotkey, transcribes speech using OpenAI's Whisper model, and copies the result to your clipboard.

---

## Features

* Hold-to-record using a customizable hotkey combination
* GPU acceleration with automatic CUDA detection and CPU fallback
* Instant copy of transcribed text to the clipboard
* Persistent model instance for low-latency transcription
* Configurable model size and beam search settings
* Detailed debug output and performance metrics
* Cross-platform support (Linux, macOS, Windows)
* Voice Activity Detection (VAD) for clean audio capture
* Auto-start service integration for all platforms
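
The VAD feature relies on `webrtcvad` (listed in the requirements), which classifies short, fixed-length frames of 16-bit mono PCM. How HoldTranscribe frames its audio is not shown in this README; the following is a minimal sketch, assuming 16 kHz audio and 30 ms frames. `split_frames` is a hypothetical helper name, not part of the package.

```python
from typing import Iterator

SAMPLE_RATE = 16_000      # Hz; assumed (webrtcvad also accepts 8000, 32000, 48000)
FRAME_MS = 30             # webrtcvad accepts 10, 20, or 30 ms frames
BYTES_PER_SAMPLE = 2      # 16-bit PCM


def split_frames(pcm: bytes, sample_rate: int = SAMPLE_RATE,
                 frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield consecutive fixed-size frames, dropping a trailing partial frame."""
    frame_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]
```

Each frame could then be passed to `webrtcvad.Vad(mode).is_speech(frame, SAMPLE_RATE)` to decide whether to keep it.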

---

## Platform-Specific Requirements

### Linux
* Python 3.8 or later
* Bash-compatible shell (for installer script)
* A CUDA-capable GPU (optional, for hardware acceleration)
* PulseAudio or equivalent audio system
* Permissions to read input events (user in `input` group)
* X11 or Wayland desktop environment

### macOS
* Python 3.8 or later
* macOS 10.14 (Mojave) or later
* Microphone access permissions
* Accessibility permissions for global hotkey monitoring
* Optional: CUDA-capable GPU (Intel Macs with NVIDIA hardware only; Apple Silicon Macs have no CUDA support, so expect CPU transcription there)

### Windows
* Python 3.8 or later
* Windows 10 or later (Windows 11 recommended)
* Microphone access permissions
* Optional: CUDA-capable GPU with appropriate drivers
* PowerShell 5.0 or later (for service installation)

---

## Installation

### Option 1: Pip Installation (Recommended)

**From GitHub (all platforms):**
```bash
pip install git+https://github.com/binaryninja/holdtranscribe.git
```

**From PyPI (when available):**
```bash
pip install holdtranscribe
```

### Option 2: Manual Installation

1. **Clone the repository:**
   ```bash
   git clone https://github.com/binaryninja/holdtranscribe.git
   cd holdtranscribe
   ```

2. **Install Python dependencies:**
   ```bash
   pip install faster-whisper sounddevice pynput webrtcvad pyperclip notify2 numpy psutil
   ```

3. **Optional GPU acceleration:**
   
   **Linux/Windows with CUDA:**
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
   ```
   
   **macOS with Metal Performance Shaders:**
   ```bash
   pip install torch torchvision torchaudio
   ```

---

## Platform-Specific Setup

### Linux Setup

1. **Add user to input group (if needed):**
   ```bash
   sudo usermod -aG input $USER
   ```
   Log out and back in for changes to take effect.

2. **Install system dependencies (Ubuntu/Debian):**
   ```bash
   sudo apt update
   sudo apt install python3-pip portaudio19-dev pulseaudio
   ```

3. **Install system dependencies (Fedora/RHEL):**
   ```bash
   sudo dnf install python3-pip portaudio-devel pulseaudio
   ```

### macOS Setup

1. **Install dependencies via Homebrew:**
   ```bash
   # Install Homebrew if not already installed
   /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
   
   # Install PortAudio
   brew install portaudio
   ```

2. **Grant permissions:**
   - **Microphone Access:** System Preferences → Security & Privacy → Privacy → Microphone → Enable for Terminal/your Python environment
   - **Accessibility Access:** System Preferences → Security & Privacy → Privacy → Accessibility → Enable for Terminal/your Python environment
   - **Input Monitoring:** System Preferences → Security & Privacy → Privacy → Input Monitoring → Enable for Terminal/your Python environment

3. **For Apple Silicon Macs:**
   ```bash
   # Install Python dependencies with conda for better compatibility
   conda install python=3.9
   pip install faster-whisper sounddevice pynput webrtcvad pyperclip notify2 numpy psutil
   ```

### Windows Setup

1. **Install Python via the Microsoft Store or python.org:**
   - Download Python from [python.org](https://python.org) or install via Microsoft Store
   - Ensure "Add Python to PATH" is checked during installation

2. **Install Visual C++ Build Tools (if compilation errors occur):**
   - Download and install Microsoft C++ Build Tools
   - Or install Visual Studio Community with C++ workload

3. **Grant microphone permissions:**
   - Settings → Privacy → Microphone → Allow apps to access microphone → Enable for Python/Terminal

---

## Usage

### Basic Usage (All Platforms)

```bash
# Run with default settings (if installed via pip)
holdtranscribe

# Or if using the script directly
python voice_hold_to_clip.py
```

### Command Line Options

```bash
--model <size>       Whisper model size (tiny, base, small, medium, large-v3). Default: large-v3
--beam-size <n>      Beam search width (1 for fastest). Default: 5
--fast               Shorthand for `--model base --beam-size 1`
--debug              Enable verbose timing and resource metrics
--device <cpu|cuda>  Force CPU or GPU mode
```
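
For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the table above. It is an illustration, not the package's actual parser: `build_parser` and `parse_options` are hypothetical names, and the rule that `--fast` overrides an explicit `--model` or `--beam-size` is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(prog="holdtranscribe")
    parser.add_argument("--model", default="large-v3",
                        choices=["tiny", "base", "small", "medium", "large-v3"])
    parser.add_argument("--beam-size", type=int, default=5)
    parser.add_argument("--fast", action="store_true")
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--device", choices=["cpu", "cuda"], default=None)
    return parser


def parse_options(argv):
    """Parse argv and expand --fast into its documented shorthand."""
    args = build_parser().parse_args(argv)
    if args.fast:
        # Assumption: --fast wins over any explicit --model/--beam-size.
        args.model, args.beam_size = "base", 1
    return args
```

Under this reading, combining `--fast` with `--model tiny` would silently select `base`, so it is safer to pass either `--fast` or explicit values, not both.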

### Platform-Specific Examples

**Linux/macOS:**
```bash
holdtranscribe --model tiny --beam-size 1
```

**Windows (Command Prompt):**
```cmd
holdtranscribe --model tiny --beam-size 1
```

**Windows (PowerShell):**
```powershell
holdtranscribe --model tiny --beam-size 1
```

---

## Auto-Start Service Setup

### Linux (systemd)

1. **Create service directory:**
   ```bash
   mkdir -p ~/.config/systemd/user
   ```

2. **Create service file:**
   ```bash
   cat > ~/.config/systemd/user/holdtranscribe.service << 'EOF'
   [Unit]
   Description=HoldTranscribe Voice Transcriber
   After=graphical-session.target

   [Service]
   Type=simple
   ExecStart=/usr/bin/holdtranscribe --model large-v3 --beam-size 1
   Restart=always
   RestartSec=5
   Environment=DISPLAY=:0
   Environment=XDG_RUNTIME_DIR=/run/user/%U
   WorkingDirectory=%h

   [Install]
   WantedBy=default.target
   EOF
   ```

3. **Enable and start:**
   ```bash
   systemctl --user daemon-reload
   systemctl --user enable holdtranscribe.service
   systemctl --user start holdtranscribe.service
   ```
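
The unit file above hardcodes `/usr/bin/holdtranscribe`, but a `pip install --user` typically places the entry point elsewhere (for example under `~/.local/bin`). As a hedged sketch, the unit text can be rendered from whatever path is actually on `PATH`. `render_unit` is a hypothetical helper, and the template is an abbreviated copy of the unit shown above.

```python
import shutil

# Abbreviated copy of the unit above; {exec_path} is filled in at render time.
UNIT_TEMPLATE = """\
[Unit]
Description=HoldTranscribe Voice Transcriber
After=graphical-session.target

[Service]
Type=simple
ExecStart={exec_path} --model large-v3 --beam-size 1
Restart=always
RestartSec=5
Environment=DISPLAY=:0
WorkingDirectory=%h

[Install]
WantedBy=default.target
"""


def render_unit(exec_path: str) -> str:
    """Return the unit file text with ExecStart pointing at exec_path."""
    return UNIT_TEMPLATE.format(exec_path=exec_path)


# Example (not executed here):
#   path = shutil.which("holdtranscribe")   # None if not on PATH
#   if path: print(render_unit(path))
```

The printed text can be written to `~/.config/systemd/user/holdtranscribe.service` before running the `systemctl --user` commands.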

### macOS (launchd)

1. **Create launch agent directory:**
   ```bash
   mkdir -p ~/Library/LaunchAgents
   ```

2. **Create plist file:**
   ```bash
   cat > ~/Library/LaunchAgents/com.holdtranscribe.plist << 'EOF'
   <?xml version="1.0" encoding="UTF-8"?>
   <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
   <plist version="1.0">
   <dict>
       <key>Label</key>
       <string>com.holdtranscribe</string>
       <key>ProgramArguments</key>
       <array>
           <string>/usr/local/bin/holdtranscribe</string>
           <string>--model</string>
           <string>large-v3</string>
           <string>--beam-size</string>
           <string>1</string>
       </array>
       <key>RunAtLoad</key>
       <true/>
       <key>KeepAlive</key>
       <true/>
   </dict>
   </plist>
   EOF
   ```

3. **Load the service:**
   ```bash
   launchctl load ~/Library/LaunchAgents/com.holdtranscribe.plist
   launchctl start com.holdtranscribe
   ```
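
The same launch agent can also be generated with Python's standard `plistlib`, which avoids hand-editing XML. This is a sketch under stated assumptions: `build_launch_agent` is a hypothetical helper, and the `StandardOutPath` and `StandardErrorPath` keys are additions that do not appear in the plist above; they are included so the service's output lands in a file.

```python
import plistlib


def build_launch_agent(exec_path: str, log_path: str) -> bytes:
    """Serialize a launchd agent definition to XML plist bytes."""
    agent = {
        "Label": "com.holdtranscribe",
        "ProgramArguments": [exec_path, "--model", "large-v3", "--beam-size", "1"],
        "RunAtLoad": True,
        "KeepAlive": True,
        # Assumption: not part of the plist above; added to capture output.
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(agent)
```

launchd does not expand `~` in these paths, so `log_path` should be absolute.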

### Windows (Task Scheduler)

1. **Create batch file for easier management:**
   ```batch
   @echo off
   holdtranscribe --model large-v3 --beam-size 1
   ```
   Save as `holdtranscribe.bat`

2. **Using Task Scheduler GUI:**
   - Open Task Scheduler (taskschd.msc)
   - Create Basic Task → Name: "HoldTranscribe"
   - Trigger: When I log on
   - Action: Start a program → Browse to your batch file
   - Finish and test

3. **Using PowerShell (run as Administrator):**
   ```powershell
   $action = New-ScheduledTaskAction -Execute "C:\path\to\holdtranscribe.bat"
   $trigger = New-ScheduledTaskTrigger -AtLogon
   $settings = New-ScheduledTaskSettingsSet -AllowStartIfOnBatteries -DontStopIfGoingOnBatteries
   Register-ScheduledTask -TaskName "HoldTranscribe" -Action $action -Trigger $trigger -Settings $settings
   ```

---

## Configuration

### Hotkey Customization

Edit the `HOTKEY` set in the script to change key combinations:

```python
# Default: Ctrl + Mouse Forward Button
HOTKEY = {keyboard.Key.ctrl, mouse.Button.button9}

# Alternative examples:
# HOTKEY = {keyboard.Key.ctrl, keyboard.Key.space}  # Ctrl + Space
# HOTKEY = {keyboard.Key.alt, mouse.Button.left}    # Alt + Left Click
# HOTKEY = {mouse.Button.button8}                   # Mouse Back Button only
```
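
Because `HOTKEY` is a set, one plausible way to detect "all keys held" is to track pressed inputs and compare against it. The script's real logic is not shown in this README, so treat the following as a sketch: `HotkeyTracker` is a hypothetical class, and plain strings stand in for the `pynput` key and button objects.

```python
class HotkeyTracker:
    """Track held inputs and report when the full hotkey set is down."""

    def __init__(self, hotkey):
        self.hotkey = frozenset(hotkey)
        self.pressed = set()

    @property
    def active(self) -> bool:
        # Only hotkey members are ever stored, so equality means "all held".
        return self.pressed == self.hotkey

    def press(self, key) -> bool:
        """Record a press; return True only on the transition to active."""
        was_active = self.active
        if key in self.hotkey:
            self.pressed.add(key)
        return self.active and not was_active

    def release(self, key) -> bool:
        """Record a release; return True only on the transition to inactive."""
        was_active = self.active
        self.pressed.discard(key)
        return was_active and not self.active
```

In a real setup, `press` and `release` would be called from `pynput`'s `keyboard.Listener(on_press=..., on_release=...)` and `mouse.Listener(on_click=...)` callbacks; a `True` from `press` would start recording and a `True` from `release` would stop it. Returning `True` only on transitions keeps keyboard auto-repeat from restarting the recording.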

### Platform-Specific Mouse Button Notes

- **Windows:** Button numbers may vary by mouse driver
- **macOS:** Some mouse buttons may require additional permissions
- **Linux:** Button numbers can be checked with the `xev` command

### Environment Variables

- `CUDA_VISIBLE_DEVICES` - Control GPU usage
- `TRANSFORMERS_CACHE` - Customize model cache location  
- `DISABLE_NOTIFY=1` - Suppress desktop notifications
- `PULSE_SERVER` (Linux) - Specify PulseAudio server
- `PORTAUDIO_DEVICE` - Force specific audio device
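
As an illustration of how a flag such as `DISABLE_NOTIFY=1` might be read, here is a small sketch. `notifications_enabled` is a hypothetical function, and treating only the exact value `1` as "disabled" is an assumption based on the form shown in the list above.

```python
import os


def notifications_enabled(env=None) -> bool:
    """Return False only when DISABLE_NOTIFY is set to exactly "1"."""
    env = os.environ if env is None else env
    return env.get("DISABLE_NOTIFY") != "1"
```

Passing a dictionary instead of relying on `os.environ` keeps the check easy to test.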

---

## Monitoring and Logs

### Linux (systemd)
```bash
# View logs
journalctl --user -u holdtranscribe.service -f

# Check status
systemctl --user status holdtranscribe.service
```

### macOS (launchd)
```bash
# View logs (this file exists only if the plist sets StandardOutPath/StandardErrorPath to it)
tail -f ~/Library/Logs/com.holdtranscribe.log

# Check status
launchctl list | grep holdtranscribe
```

### Windows (Task Scheduler)
- Task Scheduler → Task Scheduler Library → HoldTranscribe → History tab
- Or check Windows Event Viewer → Applications and Services Logs

---

## Troubleshooting

### Common Issues (All Platforms)

**Model loading errors:**
```bash
# Clear cache and retry
rm -rf ~/.cache/huggingface/transformers/
holdtranscribe --model tiny  # Start with smaller model
```

**Audio device issues:**
```bash
# List available devices
python -c "import sounddevice as sd; print(sd.query_devices())"
```
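
The device list includes output-only devices as well. A short sketch for narrowing it to microphones, assuming the dictionary keys `name` and `max_input_channels` that `sounddevice` reports for each device (`input_devices` is a hypothetical helper):

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


# Example (requires sounddevice and PortAudio, not executed here):
#   import sounddevice as sd
#   for index, name in input_devices(sd.query_devices()):
#       print(index, name)
```

If the list comes back empty, the problem is likely at the OS or PortAudio level, not in HoldTranscribe itself.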

### Linux-Specific Issues

**Permission denied on input events:**
```bash
sudo usermod -aG input $USER
# Log out and back in
```

**Audio issues with PulseAudio:**
```bash
# Restart PulseAudio
pulseaudio -k
pulseaudio --start
```

**X11 forwarding issues:**
```bash
export DISPLAY=:0
xhost +local:
```

### macOS-Specific Issues

**Accessibility permissions denied:**
- System Preferences → Security & Privacy → Privacy → Accessibility
- Add Terminal or your Python executable
- May need to remove and re-add if issues persist

**Microphone access denied:**
- System Preferences → Security & Privacy → Privacy → Microphone
- Enable for Terminal/Python

**"Operation not permitted" errors:**
```bash
# Try running with sudo temporarily to identify permission issue
sudo holdtranscribe --debug
```

**Python/PortAudio conflicts:**
```bash
# Reinstall with Homebrew
brew uninstall portaudio
brew install portaudio
pip uninstall sounddevice
pip install sounddevice
```

### Windows-Specific Issues

**DLL load failures:**
- Install the Microsoft Visual C++ Redistributable, available from the Microsoft website

**Microphone access denied:**
- Settings → Privacy → Microphone → Allow apps to access microphone
- Ensure Python/Terminal is enabled

**CUDA issues:**
```cmd
REM Check CUDA installation
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```
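
The "automatic CUDA detection and CPU fallback" behaviour can be pictured as a small decision function. This is a sketch, not the package's code: `choose_device` is a hypothetical name, and the compute types (`float16` on GPU, `int8` on CPU) are assumed defaults chosen for illustration.

```python
from typing import Optional, Tuple


def choose_device(cuda_device_count: int,
                  forced: Optional[str] = None) -> Tuple[str, str]:
    """Return a (device, compute_type) pair for loading the model."""
    if forced == "cuda" and cuda_device_count == 0:
        raise RuntimeError("--device cuda requested but no CUDA device found")
    # Honour an explicit choice, otherwise prefer the GPU when one exists.
    device = forced or ("cuda" if cuda_device_count > 0 else "cpu")
    compute_type = "float16" if device == "cuda" else "int8"
    return device, compute_type
```

Since `faster-whisper` is built on CTranslate2, the count could come from `ctranslate2.get_cuda_device_count()`, and the result could be passed on as `WhisperModel(size, device=device, compute_type=compute_type)`.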

**PowerShell execution policy:**
```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

**Antivirus blocking:**
- Add Python executable to antivirus exclusions
- Add HoldTranscribe directory to exclusions

### Performance Optimization

**For slower systems:**
```bash
# Use the fastest settings (tiny model, beam size 1)
holdtranscribe --model tiny --beam-size 1
```

**For better accuracy:**
```bash
# Use larger model with more processing
holdtranscribe --model large-v3 --beam-size 5
```

**Memory management:**
```bash
# Monitor memory usage
holdtranscribe --debug
```
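
What `--debug` prints is not shown in this README. As a rough idea of the kind of timing metric it refers to, here is a standard-library sketch; `timed` is a hypothetical helper, not the tool's implementation.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Store the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start
```

Wrapping the recording and transcription steps in `with timed("transcribe", metrics): ...` would show where latency accumulates. For memory, `psutil.Process().memory_info().rss` (from the `psutil` dependency) reports the resident set size in bytes.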

---

## Contributing

Contributions, issues, and feature requests are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Test on multiple platforms when possible
4. Submit a pull request

When reporting issues, please include:
- Operating system and version
- Python version
- Full error message
- Steps to reproduce
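
The first two items can be collected with a few lines of standard-library Python and pasted into the issue. `environment_report` is a hypothetical helper shown only for convenience.

```python
import platform
import sys


def environment_report() -> str:
    """Summarize OS, Python version, and interpreter path for a bug report."""
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {platform.python_version()}",
        f"Executable: {sys.executable}",
    ]
    return "\n".join(lines)
```

Running `print(environment_report())` in the same environment where HoldTranscribe is installed gives the most useful output.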

---

## License

This project is licensed under the MIT License. See [LICENSE](LICENSE) for details.

---

## Acknowledgments

- OpenAI Whisper team for the excellent speech recognition model
- Contributors to the faster-whisper implementation
- All the open-source libraries that make this project possible

Raw data

{
    "_id": null,
    "home_page": "https://github.com/binaryninja/holdtranscribe",
    "name": "holdtranscribe",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "voice, transcription, whisper, hotkey, clipboard, speech-to-text",
    "author": "binaryninja",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/25/35/5f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982/holdtranscribe-1.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Hotkey-Activated Voice-to-Clipboard Transcriber",
    "version": "1.0.1",
    "project_urls": {
        "Documentation": "https://github.com/binaryninja/holdtranscribe#readme",
        "Homepage": "https://github.com/binaryninja/holdtranscribe",
        "Issues": "https://github.com/binaryninja/holdtranscribe/issues",
        "Repository": "https://github.com/binaryninja/holdtranscribe"
    },
    "split_keywords": [
        "voice",
        " transcription",
        " whisper",
        " hotkey",
        " clipboard",
        " speech-to-text"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f83a396824b62b27eb02c0e1f6acf16ddf7e9b213cf13be1f87cc9790b81e9c1",
                "md5": "ea35c8bd833e2098790998d5c3628bfa",
                "sha256": "af6c5b22a3af62efbfccd4ae28a8c12df774d593efda2d2581321590b0f9e6e2"
            },
            "downloads": -1,
            "filename": "holdtranscribe-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ea35c8bd833e2098790998d5c3628bfa",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 13839,
            "upload_time": "2025-07-18T17:12:46",
            "upload_time_iso_8601": "2025-07-18T17:12:46.517658Z",
            "url": "https://files.pythonhosted.org/packages/f8/3a/396824b62b27eb02c0e1f6acf16ddf7e9b213cf13be1f87cc9790b81e9c1/holdtranscribe-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "25355f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982",
                "md5": "2fe9738cbfb9e690a8505b72379c0d4a",
                "sha256": "5b1c63f5a9480c93a4648b344ede67fb5342b0e4c5c361a2431bd1b27e1ff865"
            },
            "downloads": -1,
            "filename": "holdtranscribe-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "2fe9738cbfb9e690a8505b72379c0d4a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 25989,
            "upload_time": "2025-07-18T17:12:47",
            "upload_time_iso_8601": "2025-07-18T17:12:47.675083Z",
            "url": "https://files.pythonhosted.org/packages/25/35/5f7c82aefff237db3f7c1c160541043918c2da5a9dc08ee53a2ffd981982/holdtranscribe-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-18 17:12:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "binaryninja",
    "github_project": "holdtranscribe",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "faster-whisper",
            "specs": [
                [
                    ">=",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "sounddevice",
            "specs": [
                [
                    ">=",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "pynput",
            "specs": [
                [
                    ">=",
                    "1.7.6"
                ]
            ]
        },
        {
            "name": "webrtcvad",
            "specs": [
                [
                    ">=",
                    "2.0.10"
                ]
            ]
        },
        {
            "name": "pyperclip",
            "specs": [
                [
                    ">=",
                    "1.8.2"
                ]
            ]
        },
        {
            "name": "notify2",
            "specs": [
                [
                    ">=",
                    "0.3.1"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ]
            ]
        },
        {
            "name": "psutil",
            "specs": [
                [
                    ">=",
                    "5.9.0"
                ]
            ]
        }
    ],
    "lcname": "holdtranscribe"
}

binaryninja