# ReadVideo
A modern Python-based video and audio transcription tool that extracts and transcribes content from YouTube, Bilibili, and local media files. This project is a complete rewrite of the original bash script with improved modularity, performance, and user experience.
## 🚀 Features
### Multi-Platform Support
- **YouTube**: Prioritizes existing subtitles, falls back to audio transcription
- **Bilibili**: Automatically downloads and transcribes audio using BBDown
- **Local Files**: Supports various audio and video file formats
### Intelligent Processing
- **Subtitle Priority**: For YouTube videos, existing subtitles are fetched first via `youtube-transcript-api`
- **Multi-Language Support**: Supports Chinese, English, and more with auto-detection or manual specification
- **Fallback Mechanism**: Automatically falls back to audio transcription when subtitles are unavailable
### High Performance
- **Tool Reuse**: Directly calls installed whisper-cli for native performance
- **Model Reuse**: Uses existing models in the `~/.whisper-models/` directory
- **Efficient Processing**: Smart temporary file management and cleanup
## 📦 Installation
### Prerequisites
- Python 3.11+
- ffmpeg (system installation)
- whisper-cli (from whisper.cpp)
- yt-dlp (Python package, installed automatically as a dependency)
- BBDown (optional, for Bilibili support)
### Install from PyPI
#### Option 1: Install as a global tool (Recommended)
```bash
# Using uv (recommended - fast and isolated)
uv tool install readvideo
# Using pipx (alternative tool installer)
pipx install readvideo
# Using pip globally
pip install readvideo
```
#### Option 2: Development Installation
```bash
# Clone and install for development
git clone https://github.com/learnerLj/readvideo.git
cd readvideo
uv sync
# Or with pip
pip install -e .
```
### System Dependencies
```bash
# macOS
brew install ffmpeg whisper-cpp
# Ubuntu/Debian
sudo apt install ffmpeg
# Install whisper.cpp from source: https://github.com/ggerganov/whisper.cpp
# Download Whisper model (if not already present)
mkdir -p ~/.whisper-models
# Download ggml-large-v3.bin to ~/.whisper-models/
```
## 🎯 Quick Start
### Basic Usage
```bash
# YouTube video (prioritizes subtitles)
readvideo https://www.youtube.com/watch?v=abc123
# Auto language detection
readvideo --auto-detect https://www.youtube.com/watch?v=abc123
# Bilibili video
readvideo https://www.bilibili.com/video/BV1234567890
# Local audio file
readvideo ~/Music/podcast.mp3
# Local video file
readvideo ~/Videos/lecture.mp4
# Custom output directory
readvideo input.mp4 --output-dir ./transcripts
# Show information only
readvideo input.mp4 --info-only
```
### Command Line Options
```
Options:
  --auto-detect          Enable automatic language detection (default: Chinese)
  --output-dir, -o PATH  Output directory (default: current directory or input file directory)
  --no-cleanup           Do not clean up temporary files
  --info-only            Show input information only, do not process
  --whisper-model PATH   Path to Whisper model file [default: ~/.whisper-models/ggml-large-v3.bin]
  --verbose, -v          Verbose output
  --proxy TEXT           HTTP proxy address (e.g., http://127.0.0.1:8080)
  --help                 Show this message and exit
```
## 🏗️ Architecture
### Project Structure
```
readvideo/
├── pyproject.toml                 # Project configuration
├── README.md                      # Project documentation
└── src/readvideo/
    ├── __init__.py                # Package initialization
    ├── cli.py                     # CLI entry point
    ├── core/                      # Core functionality modules
    │   ├── transcript_fetcher.py  # YouTube subtitle fetcher
    │   ├── whisper_wrapper.py     # whisper-cli wrapper
    │   └── audio_processor.py     # Audio processor
    └── platforms/                 # Platform handlers
        ├── youtube.py             # YouTube handler
        ├── bilibili.py            # Bilibili handler
        └── local.py               # Local file handler
```
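
The layout above implies that `cli.py` routes each input to one of the three platform handlers. A minimal sketch of that routing step follows; the function name `detect_platform` and the host lists are assumptions for illustration, not the project's actual code.

```python
from urllib.parse import urlparse

# Hypothetical host lists; the real handlers may recognise more domains.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}
BILIBILI_HOSTS = {"bilibili.com", "www.bilibili.com", "b23.tv"}


def detect_platform(source: str) -> str:
    """Return 'youtube', 'bilibili', or 'local' for a CLI input string."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "youtube"
        if host in BILIBILI_HOSTS:
            return "bilibili"
        raise ValueError(f"Unsupported URL host: {host!r}")
    # Anything that is not an http(s) URL is treated as a local path.
    return "local"
```

Keeping the host sets as module-level constants means a new platform only needs one more set and one more branch.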
### Core Dependencies
- `youtube-transcript-api`: YouTube subtitle extraction
- `yt-dlp`: YouTube video downloading
- `click`: Command-line interface
- `rich`: Beautiful console output
- `tenacity`: Retry mechanisms
- `ffmpeg`: Audio processing (system dependency)
- `whisper-cli`: Speech transcription (system dependency)
## 🔧 How It Works
### YouTube Processing
1. **Subtitle Priority**: Attempts to fetch existing subtitles using `youtube-transcript-api`
2. **Language Preference**: Prioritizes Chinese (zh, zh-Hans, zh-Hant), then English
3. **Fallback**: If no subtitles available, downloads audio with `yt-dlp`
4. **Transcription**: Converts audio to WAV and transcribes with whisper-cli
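
The four steps above can be sketched as two small functions: one that picks a subtitle language by priority, and one that falls back to audio transcription. The names `pick_language` and `get_transcript` are hypothetical, and the two callables stand in for the real subtitle fetcher and the `yt-dlp` plus whisper-cli path, whose actual interfaces are not shown in this README.

```python
from typing import Callable, Optional, Sequence

# Priority order taken from step 2: Chinese variants first, then English.
PREFERRED_LANGUAGES = ("zh", "zh-Hans", "zh-Hant", "en")


def pick_language(available: Sequence[str],
                  preferred: Sequence[str] = PREFERRED_LANGUAGES) -> Optional[str]:
    """Return the first preferred language code that is available, else None."""
    for code in preferred:
        if code in available:
            return code
    return None


def get_transcript(video_url: str,
                   fetch_subtitles: Callable[[str], Optional[str]],
                   transcribe_audio: Callable[[str], str]) -> str:
    """Try existing subtitles first; fall back to audio transcription."""
    try:
        text = fetch_subtitles(video_url)
        if text:
            return text
    except Exception:
        # This sketch treats every subtitle error as "no subtitles available".
        pass
    return transcribe_audio(video_url)
```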
### Bilibili Processing
1. **Audio Download**: Uses BBDown to extract audio from Bilibili videos
2. **Format Conversion**: Converts audio to WAV format using ffmpeg
3. **Transcription**: Processes audio with whisper-cli
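
As a rough illustration of the transcription step, the sketch below builds a whisper-cli argument list. The flags (`-m`, `-f`, `-l`, `-otxt`, `-of`) are whisper.cpp options, but which of them readvideo actually passes is an assumption, and `build_whisper_command` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def build_whisper_command(wav_path: Path, model_path: Path,
                          language: str = "zh") -> List[str]:
    """Build an argument list for whisper-cli (pass it to subprocess.run)."""
    return [
        "whisper-cli",
        "-m", str(model_path),   # ggml model file
        "-f", str(wav_path),     # input WAV file
        "-l", language,          # spoken language; "auto" enables detection
        "-otxt",                 # write a plain-text transcript
        "-of", str(wav_path.with_suffix("")),  # output path without extension
    ]
```

Passing `language="auto"` would correspond to the `--auto-detect` option; the default of `"zh"` mirrors the documented default language.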
### Local File Processing
1. **Format Detection**: Automatically detects audio vs video files
2. **Audio Extraction**: Extracts audio tracks from video files using ffmpeg
3. **Format Conversion**: Converts to whisper-compatible WAV format
4. **Transcription**: Processes with whisper-cli
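
The extraction and conversion steps can be combined into a single ffmpeg call. whisper.cpp expects 16 kHz, mono, 16-bit PCM WAV input, so the sketch below requests exactly that; the helper name `build_ffmpeg_command` is hypothetical and the project's real parameters may differ.

```python
from pathlib import Path
from typing import List, Union


def build_ffmpeg_command(src: Union[str, Path], dst: Union[str, Path]) -> List[str]:
    """Build an ffmpeg argument list producing whisper-compatible WAV audio."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it exists
        "-i", str(src),       # input audio or video file
        "-vn",                # drop any video stream
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        str(dst),
    ]
```

Because `-vn` is harmless for audio-only inputs, the same command would cover both the audio and the video case.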
## 📋 Supported Formats
### Audio Formats
- MP3, M4A, WAV, FLAC, OGG, AAC, WMA
### Video Formats
- MP4, MKV, AVI, MOV, WMV, FLV, WEBM, M4V
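
One simple way to tell audio from video is to look at the file extension. The sketch below uses the two format lists above; `classify_media` and the constant names are hypothetical, and the real `AudioProcessor` may detect formats differently.

```python
from pathlib import Path
from typing import Union

AUDIO_EXTENSIONS = {".mp3", ".m4a", ".wav", ".flac", ".ogg", ".aac", ".wma"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi", ".mov", ".wmv", ".flv", ".webm", ".m4v"}


def classify_media(path: Union[str, Path]) -> str:
    """Return 'audio' or 'video' based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    raise ValueError(f"Unsupported file format: {suffix or '(none)'}")
```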
## 🛠️ Configuration
### Whisper Model Configuration
```bash
# Default model path: ~/.whisper-models/ggml-large-v3.bin
# Custom model
readvideo input.mp4 --whisper-model /path/to/model.bin
```
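
A sketch of how the model path might be resolved: expand `~`, fall back to the documented default, and fail early if the file is missing. `resolve_model_path` is a hypothetical helper, not the project's API.

```python
from pathlib import Path
from typing import Optional

DEFAULT_MODEL = "~/.whisper-models/ggml-large-v3.bin"


def resolve_model_path(model: Optional[str] = None) -> Path:
    """Return the model path to use, raising if the file does not exist."""
    path = Path(model or DEFAULT_MODEL).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"Whisper model not found: {path}")
    return path
```

Failing before any download or conversion starts avoids spending minutes on work that cannot be transcribed.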
### Language Options
- `--auto-detect`: Automatic language detection
- Default: Chinese (`zh`)
- YouTube subtitles support multi-language priority
## 🧪 Testing
### Test Examples
```bash
# YouTube video with subtitles
readvideo "https://www.youtube.com/watch?v=JdKVJH3xmlU" --info-only
# Bilibili video
readvideo "https://www.bilibili.com/video/BV1Tjt9zJEdw" --info-only
# Test local file format support
echo "test" > test.txt
readvideo test.txt --info-only # Should show format error
```
### Debugging
```bash
# Verbose output
readvideo input.mp4 --verbose
# Keep temporary files
readvideo input.mp4 --no-cleanup --verbose
# Information only (no processing)
readvideo input.mp4 --info-only
```
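
The `--no-cleanup` flag suggests that intermediate files live in a temporary directory that is normally deleted. A minimal sketch of that pattern, assuming a hypothetical `working_directory` context manager and a `readvideo_` directory prefix:

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def working_directory(cleanup: bool = True) -> Iterator[Path]:
    """Yield a fresh temporary directory; remove it afterwards unless told not to."""
    path = Path(tempfile.mkdtemp(prefix="readvideo_"))
    try:
        yield path
    finally:
        # The finally clause runs even when processing raises.
        if cleanup:
            shutil.rmtree(path, ignore_errors=True)
```

With `cleanup=False` the directory is left in place, so the downloaded audio and converted WAV can be inspected after a failed run.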
## ⚡ Performance
### Speed Comparison
| Operation | Time | Notes |
|-----------|------|-------|
| YouTube subtitle fetch | ~3-5s | When subtitles available |
| YouTube audio download | ~30s-2min | Depends on video length |
| Audio conversion | ~5-15s | Depends on file size |
| Whisper transcription | ~0.1-0.5x video length | Depends on model and audio length |
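
To make the last row concrete: at 0.1-0.5x of the video length, a 60-minute video would take roughly 6 to 30 minutes to transcribe. The helper below only applies the table's factors; its name is hypothetical, and real timings depend on hardware and model.

```python
from typing import Tuple

# Factors from the table: transcription takes about 0.1x to 0.5x the video length.
FAST_FACTOR = 0.1
SLOW_FACTOR = 0.5


def estimate_transcription_minutes(video_minutes: float) -> Tuple[float, float]:
    """Return the (best case, worst case) transcription time in minutes."""
    return (video_minutes * FAST_FACTOR, video_minutes * SLOW_FACTOR)
```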
### Performance Features
- **Subtitle Priority**: 10-100x faster than audio transcription for YouTube
- **Native Tools**: Direct whisper-cli calls maintain original performance
- **Smart Caching**: Reuses existing models and temporary files efficiently
## 🚨 Troubleshooting
### Common Issues
#### 1. whisper-cli not found
```bash
# Solution: Install whisper.cpp
brew install whisper-cpp # macOS
# Or compile from source: https://github.com/ggerganov/whisper.cpp
```
#### 2. ffmpeg not found
```bash
# Solution: Install ffmpeg
brew install ffmpeg # macOS
sudo apt install ffmpeg # Ubuntu/Debian
```
#### 3. Model file missing
```bash
# Solution: Download whisper model
mkdir -p ~/.whisper-models
# Download ggml-large-v3.bin into ~/.whisper-models/ (for example with whisper.cpp's models/download-ggml-model.sh script)
```
#### 4. YouTube IP restrictions
- The tool automatically falls back to audio download when the subtitle API is blocked
- Consider using a proxy with the `--proxy` option if needed
- Otherwise, wait a while and retry
#### 5. BBDown not found (Bilibili only)
- Download from [BBDown GitHub](https://github.com/nilaoda/BBDown)
- Ensure it's in your PATH
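
Issues 1, 2, and 5 all come down to an executable missing from `PATH`. A quick preflight check with the standard library can report them before any processing starts; the function name `missing_tools` and the tool groupings are assumptions for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional

REQUIRED_TOOLS = ("ffmpeg", "whisper-cli")
OPTIONAL_TOOLS = ("BBDown",)  # only needed for Bilibili


def missing_tools(tools: Iterable[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    for tool in missing_tools(REQUIRED_TOOLS + OPTIONAL_TOOLS):
        print(f"Not found on PATH: {tool}")
```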
### Error Handling
- **Graceful Fallbacks**: YouTube subtitle failures automatically retry with audio transcription
- **Intelligent Retries**: Network issues are retried automatically, but IP blocks are not
- **Clear Error Messages**: Descriptive error messages with suggested solutions
- **Cleanup on Failure**: Temporary files are cleaned up even if processing fails
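
The retry policy above (retry network errors, give up immediately on IP blocks) can be sketched with the standard library. The project lists `tenacity` for retries; this stand-in avoids that dependency, and `IPBlockedError` and `retry_network` are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class IPBlockedError(Exception):
    """Hypothetical error raised when the remote service blocks this IP."""


def retry_network(func: Callable[[], T], attempts: int = 3,
                  delay: float = 1.0,
                  sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying transient network errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except IPBlockedError:
            raise  # retrying will not help; let the caller fall back
        except (ConnectionError, TimeoutError):
            if attempt == attempts:
                raise
            sleep(delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Re-raising `IPBlockedError` at once lets the caller switch to the audio-download path instead of waiting through pointless retries.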
## 🔒 Security Notes
### Cookie Usage
- Browser cookies are used only for video downloads (yt-dlp), not for subtitle API calls
- This follows security recommendations from the youtube-transcript-api maintainer
- Cookies help bypass some YouTube download restrictions
### Privacy
- No data is sent to external services except for downloading content
- All processing happens locally on your machine
- Temporary files are automatically cleaned up
## 🤝 Contributing
This project replaces a bash script with a modern Python implementation. Key design principles:
1. **Maintain Compatibility**: Same functionality as the original bash script
2. **Improve Performance**: Leverage existing tools efficiently
3. **Better UX**: Rich console output and clear error messages
4. **Extensible**: Modular design for easy platform additions
### Adding New Platforms
1. Create a new handler in `platforms/`
2. Implement `validate_url()`, `process()`, and `get_info()` methods
3. Add detection logic in CLI
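
A new handler might look roughly like the skeleton below. Only the three method names come from the steps above; the base class, the method signatures, the return types, and the `video.example.com` host are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class PlatformHandler(ABC):
    """Hypothetical base class; the real handlers may not share one."""

    @abstractmethod
    def validate_url(self, url: str) -> bool: ...

    @abstractmethod
    def get_info(self, url: str) -> dict: ...

    @abstractmethod
    def process(self, url: str, output_dir: str) -> str: ...


class ExampleHandler(PlatformHandler):
    """Handler for a made-up platform hosted at video.example.com."""

    def validate_url(self, url: str) -> bool:
        return urlparse(url).hostname == "video.example.com"

    def get_info(self, url: str) -> dict:
        return {"platform": "example", "url": url}

    def process(self, url: str, output_dir: str) -> str:
        if not self.validate_url(url):
            raise ValueError(f"Not an example.com video URL: {url}")
        # A real handler would download audio, convert it, and transcribe it here.
        raise NotImplementedError("download and transcription not implemented")
```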
### Adding New Formats
1. Update format lists in `AudioProcessor`
2. Add corresponding ffmpeg parameters
3. Test with sample files
## 📄 License
This project is released under the MIT license. It maintains compatibility with the original bash script while providing a modern Python implementation focused on performance, reliability, and user experience.
## 🙏 Acknowledgments
- [whisper.cpp](https://github.com/ggerganov/whisper.cpp) for high-performance speech recognition
- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for robust video downloading
- [youtube-transcript-api](https://github.com/jdepoix/youtube-transcript-api) for subtitle extraction
- [BBDown](https://github.com/nilaoda/BBDown) for Bilibili support
Raw data
{
"_id": null,
"home_page": null,
"name": "readvideo",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "video, transcription, youtube, bilibili, audio, whisper",
"author": "Jiahao Luo",
"author_email": "Jiahao Luo <luoshitou9@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/39/7a/fb8e98d0d1f3925e4d0fc3d8ac4b91059e61d9bdfae47626d6c4766fa967/readvideo-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python tool for downloading and transcribing videos from YouTube/Bilibili and local media files",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/learnerLj/readvideo",
"Issues": "https://github.com/learnerLj/readvideo/issues",
"Repository": "https://github.com/learnerLj/readvideo"
},
"split_keywords": [
"video",
" transcription",
" youtube",
" bilibili",
" audio",
" whisper"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "23ffed2d6a468952b289f203ecab0c0fbb4637787ceab51c96a0b53e8e97ab4f",
"md5": "0b0e119313925e0d6bf518a373e51b07",
"sha256": "bc163e85111b45077ec1f11c9b2d2339594d820847fcefa437be06dd47ec7a98"
},
"downloads": -1,
"filename": "readvideo-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0b0e119313925e0d6bf518a373e51b07",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 25762,
"upload_time": "2025-08-15T12:39:12",
"upload_time_iso_8601": "2025-08-15T12:39:12.770559Z",
"url": "https://files.pythonhosted.org/packages/23/ff/ed2d6a468952b289f203ecab0c0fbb4637787ceab51c96a0b53e8e97ab4f/readvideo-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "397afb8e98d0d1f3925e4d0fc3d8ac4b91059e61d9bdfae47626d6c4766fa967",
"md5": "27832e2a2e8f73af82d1a0d47a76a7fe",
"sha256": "3d1a852c6dec50ea7ac0fbf09336395a303d6059dff5f3e3ad1ac93398021631"
},
"downloads": -1,
"filename": "readvideo-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "27832e2a2e8f73af82d1a0d47a76a7fe",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 17982,
"upload_time": "2025-08-15T12:39:14",
"upload_time_iso_8601": "2025-08-15T12:39:14.453538Z",
"url": "https://files.pythonhosted.org/packages/39/7a/fb8e98d0d1f3925e4d0fc3d8ac4b91059e61d9bdfae47626d6c4766fa967/readvideo-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-15 12:39:14",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "learnerLj",
"github_project": "readvideo",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "readvideo"
}