# Kokoro TTS
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

## Features
- Multiple language and voice support
- Voice blending with customizable weights
- EPUB, PDF and TXT file input support
- Standard input (stdin) and `|` piping from other programs
- Streaming audio playback
- Split output into chapters
- Adjustable speech speed
- WAV and MP3 output formats
- Chapter merging capability
- Detailed debug output option
- GPU Support
## Demo
Kokoro TTS is an open-source CLI tool that delivers high-quality text-to-speech right from your terminal. Think of it as your personal voice studio, capable of transforming any text into natural-sounding speech with minimal effort.
https://github.com/user-attachments/assets/8413e640-59e9-490e-861d-49187e967526
[Demo Audio (MP3)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.mp3) | [Demo Audio (WAV)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.wav)
## TODO
- [x] Add GPU support
- [x] Add PDF support
- [ ] Add GUI
## Prerequisites
- Python 3.9-3.12 (Python 3.13+ is not currently supported)
## Installation
### Method 1: Install from PyPI (Recommended)
The easiest way to install Kokoro TTS is from PyPI:
```bash
# Using uv (recommended)
uv tool install kokoro-tts
# Using pip
pip install kokoro-tts
```
After installation, you can run:
```bash
kokoro-tts --help
```
### Method 2: Install from Git
Install directly from the repository:
```bash
# Using uv (recommended)
uv tool install git+https://github.com/nazdridoy/kokoro-tts
# Using pip
pip install git+https://github.com/nazdridoy/kokoro-tts
```
### Method 3: Clone and Install Locally
1. Clone the repository:
```bash
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
```
2. Install the package:
**With `uv` (recommended):**
```bash
uv venv
uv pip install -e .
```
**With `pip`:**
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e .
```
3. Run the tool:
```bash
# If using uv
uv run kokoro-tts --help
# If using pip with activated venv
kokoro-tts --help
```
### Method 4: Run Without Installation
If you prefer to run without installing:
1. Clone the repository:
```bash
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
```
2. Install dependencies only:
**With `uv`:**
```bash
uv venv
uv sync
```
**With `pip`:**
```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```
3. Run directly:
```bash
# With uv
uv run -m kokoro_tts --help
# With pip (venv activated)
python -m kokoro_tts --help
```
### Download Model Files
After installation, download the required model files to your working directory:
```bash
# Download voice data (bin format is preferred)
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/voices-v1.0.bin
# Download the model
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/kokoro-v1.0.onnx
```
> [!NOTE]
> The script will automatically use `voices-v1.0.bin` if present, falling back to `voices.json` if the bin file is not available. Place these files in the same directory where you run the `kokoro-tts` command.
## Supported voices:
| **Category** | **Voices** | **Language Code** |
| --- | --- | --- |
| 🇺🇸 👩 | af\_alloy, af\_aoede, af\_bella, af\_heart, af\_jessica, af\_kore, af\_nicole, af\_nova, af\_river, af\_sarah, af\_sky | **en-us** |
| 🇺🇸 👨 | am\_adam, am\_echo, am\_eric, am\_fenrir, am\_liam, am\_michael, am\_onyx, am\_puck | **en-us** |
| 🇬🇧 | bf\_alice, bf\_emma, bf\_isabella, bf\_lily, bm\_daniel, bm\_fable, bm\_george, bm\_lewis | **en-gb** |
| 🇫🇷 | ff\_siwis | **fr-fr** |
| 🇮🇹 | if\_sara, im\_nicola | **it** |
| 🇯🇵 | jf\_alpha, jf\_gongitsune, jf\_nezumi, jf\_tebukuro, jm\_kumo | **ja** |
| 🇨🇳 | zf\_xiaobei, zf\_xiaoni, zf\_xiaoxiao, zf\_xiaoyi, zm\_yunjian, zm\_yunxi, zm\_yunxia, zm\_yunyang | **cmn** |
## Usage
### Basic Usage
```bash
kokoro-tts <input_text_file> [<output_audio_file>] [options]
```
> [!NOTE]
> - If you installed via Method 1 (PyPI) or Method 2 (git install), use `kokoro-tts` directly
> - If you installed via Method 3 (local install), use `uv run kokoro-tts` or activate your virtual environment first
> - If you're using Method 4 (no install), use `uv run -m kokoro_tts` or `python -m kokoro_tts` with activated venv
### Commands
- `-h, --help`: Show help message
- `--help-languages`: List supported languages
- `--help-voices`: List available voices
- `--merge-chunks`: Merge existing chunks into chapter files
### Options
- `--stream`: Stream audio instead of saving to file
- `--speed <float>`: Set speech speed (default: 1.0)
- `--lang <str>`: Set language (default: en-us)
- `--voice <str>`: Set voice or blend voices (default: interactive selection)
- Single voice: Use voice name (e.g., "af_sarah")
- Blended voices: Use "voice1:weight,voice2:weight" format
- `--split-output <dir>`: Save each chunk as separate file in directory
- `--format <str>`: Audio format: wav or mp3 (default: wav)
- `--debug`: Show detailed debug information during processing
### Input Formats
- `.txt`: Text file input
- `.epub`: EPUB book input (will process chapters)
- `.pdf`: PDF document input (extracts chapters from TOC or content)
- `-` or `/dev/stdin` (Linux/macOS) or `CONIN$` (Windows): Standard input (stdin)
### Examples
```bash
# Basic usage with output file
kokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah
# Read from standard input (stdin)
echo "Hello World" | kokoro-tts - --stream
cat input.txt | kokoro-tts - output.wav
# Cross-platform stdin support:
# Linux/macOS: echo "text" | kokoro-tts - --stream
# Windows: echo "text" | kokoro-tts - --stream
# All platforms also support: kokoro-tts /dev/stdin --stream (Linux/macOS) or kokoro-tts CONIN$ --stream (Windows)
# Use voice blending (60-40 mix)
kokoro-tts input.txt output.wav --voice "af_sarah:60,am_adam:40"
# Use equal voice blend (50-50)
kokoro-tts input.txt --stream --voice "am_adam,af_sarah"
# Process EPUB and split into chunks
kokoro-tts input.epub --split-output ./chunks/ --format mp3
# Stream audio directly
kokoro-tts input.txt --stream --speed 0.8
# Merge existing chunks
kokoro-tts --merge-chunks --split-output ./chunks/ --format wav
# Process EPUB with detailed debug output
kokoro-tts input.epub --split-output ./chunks/ --debug
# Process PDF and split into chapters
kokoro-tts input.pdf --split-output ./chunks/ --format mp3
# List available voices
kokoro-tts --help-voices
# List supported languages
kokoro-tts --help-languages
```
> [!TIP]
> If you're using Method 3, replace `kokoro-tts` with `uv run kokoro-tts` in the examples above.
> If you're using Method 4, replace `kokoro-tts` with `uv run -m kokoro_tts` or `python -m kokoro_tts` in the examples above.
## Features in Detail
### EPUB Processing
- Automatically extracts chapters from EPUB files
- Preserves chapter titles and structure
- Creates organized output for each chapter
- Detailed debug output available for troubleshooting
### Audio Processing
- Chunks long text into manageable segments
- Supports streaming for immediate playback
- Voice blending with customizable mix ratios
- Progress indicators for long processes
- Handles interruptions gracefully
### Output Options
- Single file output
- Split output with chapter organization
- Chunk merging capability
- Multiple audio format support
### Debug Mode
- Shows detailed information about file processing
- Displays NCX parsing details for EPUB files
- Lists all found chapters and their metadata
- Helps troubleshoot processing issues
### Input Options
- Text file input (.txt)
- EPUB book input (.epub)
- Standard input (stdin)
- Supports piping from other programs
## Contributing
This is a personal project. But if you want to contribute, please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## Acknowledgments
- [Kokoro-ONNX](https://github.com/thewh1teagle/kokoro-onnx)
Raw data
{
"_id": null,
"home_page": null,
"name": "kokoro-tts",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.9",
"maintainer_email": null,
"keywords": "audio, cli, epub, kokoro, pdf, text-to-speech, tts",
"author": null,
"author_email": "Nazmul Hossain <nazdridoy@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/54/5d/9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b/kokoro_tts-2.2.1.tar.gz",
"platform": null,
"description": "# Kokoro TTS\n\nA CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.\n\n\n\n## Features\n\n- Multiple language and voice support\n- Voice blending with customizable weights\n- EPUB, PDF and TXT file input support\n- Standard input (stdin) and `|` piping from other programs\n- Streaming audio playback\n- Split output into chapters\n- Adjustable speech speed\n- WAV and MP3 output formats\n- Chapter merging capability\n- Detailed debug output option\n- GPU Support\n\n## Demo\n\nKokoro TTS is an open-source CLI tool that delivers high-quality text-to-speech right from your terminal. Think of it as your personal voice studio, capable of transforming any text into natural-sounding speech with minimal effort.\n\nhttps://github.com/user-attachments/assets/8413e640-59e9-490e-861d-49187e967526\n\n[Demo Audio (MP3)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.mp3) | [Demo Audio (WAV)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.wav)\n\n## TODO\n\n- [x] Add GPU support\n- [x] Add PDF support\n- [ ] Add GUI\n\n## Prerequisites\n\n- Python 3.9-3.12 (Python 3.13+ is not currently supported)\n\n## Installation\n\n### Method 1: Install from PyPI (Recommended)\n\nThe easiest way to install Kokoro TTS is from PyPI:\n\n```bash\n# Using uv (recommended)\nuv tool install kokoro-tts\n\n# Using pip\npip install kokoro-tts\n```\n\nAfter installation, you can run:\n```bash\nkokoro-tts --help\n```\n\n### Method 2: Install from Git\n\nInstall directly from the repository:\n\n```bash\n# Using uv (recommended)\nuv tool install git+https://github.com/nazdridoy/kokoro-tts\n\n# Using pip\npip install git+https://github.com/nazdridoy/kokoro-tts\n```\n\n### Method 3: Clone and Install Locally\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/nazdridoy/kokoro-tts.git\ncd kokoro-tts\n```\n\n2. Install the package:\n\n**With `uv` (recommended):**\n```bash\nuv venv\nuv pip install -e .\n```\n\n**With `pip`:**\n```bash\npython -m venv .venv\nsource .venv/bin/activate # On Windows: .venv\\Scripts\\activate\npip install -e .\n```\n\n3. Run the tool:\n```bash\n# If using uv\nuv run kokoro-tts --help\n\n# If using pip with activated venv\nkokoro-tts --help\n```\n\n### Method 4: Run Without Installation\n\nIf you prefer to run without installing:\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/nazdridoy/kokoro-tts.git\ncd kokoro-tts\n```\n\n2. Install dependencies only:\n\n**With `uv`:**\n```bash\nuv venv\nuv sync\n```\n\n**With `pip`:**\n```bash\npython -m venv .venv\nsource .venv/bin/activate # On Windows: .venv\\Scripts\\activate\npip install -r requirements.txt\n```\n\n3. Run directly:\n```bash\n# With uv\nuv run -m kokoro_tts --help\n\n# With pip (venv activated)\npython -m kokoro_tts --help\n```\n\n### Download Model Files\n\nAfter installation, download the required model files to your working directory:\n\n```bash\n# Download voice data (bin format is preferred)\nwget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/voices-v1.0.bin\n\n# Download the model\nwget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/kokoro-v1.0.onnx\n```\n\n> [!NOTE]\n> The script will automatically use `voices-v1.0.bin` if present, falling back to `voices.json` if the bin file is not available. Place these files in the same directory where you run the `kokoro-tts` command.\n\n## Supported voices:\n\n| **Category** | **Voices** | **Language Code** |\n| --- | --- | --- |\n| \ud83c\uddfa\ud83c\uddf8 \ud83d\udc69 | af\\_alloy, af\\_aoede, af\\_bella, af\\_heart, af\\_jessica, af\\_kore, af\\_nicole, af\\_nova, af\\_river, af\\_sarah, af\\_sky | **en-us** |\n| \ud83c\uddfa\ud83c\uddf8 \ud83d\udc68 | am\\_adam, am\\_echo, am\\_eric, am\\_fenrir, am\\_liam, am\\_michael, am\\_onyx, am\\_puck | **en-us** |\n| \ud83c\uddec\ud83c\udde7 | bf\\_alice, bf\\_emma, bf\\_isabella, bf\\_lily, bm\\_daniel, bm\\_fable, bm\\_george, bm\\_lewis | **en-gb** |\n| \ud83c\uddeb\ud83c\uddf7 | ff\\_siwis | **fr-fr** |\n| \ud83c\uddee\ud83c\uddf9 | if\\_sara, im\\_nicola | **it** |\n| \ud83c\uddef\ud83c\uddf5 | jf\\_alpha, jf\\_gongitsune, jf\\_nezumi, jf\\_tebukuro, jm\\_kumo | **ja** |\n| \ud83c\udde8\ud83c\uddf3 | zf\\_xiaobei, zf\\_xiaoni, zf\\_xiaoxiao, zf\\_xiaoyi, zm\\_yunjian, zm\\_yunxi, zm\\_yunxia, zm\\_yunyang | **cmn** |\n\n## Usage\n\n### Basic Usage\n\n```bash\nkokoro-tts <input_text_file> [<output_audio_file>] [options]\n```\n\n> [!NOTE]\n> - If you installed via Method 1 (PyPI) or Method 2 (git install), use `kokoro-tts` directly\n> - If you installed via Method 3 (local install), use `uv run kokoro-tts` or activate your virtual environment first\n> - If you're using Method 4 (no install), use `uv run -m kokoro_tts` or `python -m kokoro_tts` with activated venv\n\n### Commands\n\n- `-h, --help`: Show help message\n- `--help-languages`: List supported languages\n- `--help-voices`: List available voices\n- `--merge-chunks`: Merge existing chunks into chapter files\n\n### Options\n\n- `--stream`: Stream audio instead of saving to file\n- `--speed <float>`: Set speech speed (default: 1.0)\n- `--lang <str>`: Set language (default: en-us)\n- `--voice <str>`: Set voice or blend voices (default: interactive selection)\n - Single voice: Use voice name (e.g., \"af_sarah\")\n - Blended voices: Use \"voice1:weight,voice2:weight\" format\n- `--split-output <dir>`: Save each chunk as separate file in directory\n- `--format <str>`: Audio format: wav or mp3 (default: wav)\n- `--debug`: Show detailed debug information during processing\n\n### Input Formats\n\n- `.txt`: Text file input\n- `.epub`: EPUB book input (will process chapters)\n- `.pdf`: PDF document input (extracts chapters from TOC or content)\n- `-` or `/dev/stdin` (Linux/macOS) or `CONIN$` (Windows): Standard input (stdin)\n\n### Examples\n\n```bash\n# Basic usage with output file\nkokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah\n\n# Read from standard input (stdin)\necho \"Hello World\" | kokoro-tts - --stream\ncat input.txt | kokoro-tts - output.wav\n\n# Cross-platform stdin support:\n# Linux/macOS: echo \"text\" | kokoro-tts - --stream\n# Windows: echo \"text\" | kokoro-tts - --stream\n# All platforms also support: kokoro-tts /dev/stdin --stream (Linux/macOS) or kokoro-tts CONIN$ --stream (Windows)\n\n# Use voice blending (60-40 mix)\nkokoro-tts input.txt output.wav --voice \"af_sarah:60,am_adam:40\"\n\n# Use equal voice blend (50-50)\nkokoro-tts input.txt --stream --voice \"am_adam,af_sarah\"\n\n# Process EPUB and split into chunks\nkokoro-tts input.epub --split-output ./chunks/ --format mp3\n\n# Stream audio directly\nkokoro-tts input.txt --stream --speed 0.8\n\n# Merge existing chunks\nkokoro-tts --merge-chunks --split-output ./chunks/ --format wav\n\n# Process EPUB with detailed debug output\nkokoro-tts input.epub --split-output ./chunks/ --debug\n\n# Process PDF and split into chapters\nkokoro-tts input.pdf --split-output ./chunks/ --format mp3\n\n# List available voices\nkokoro-tts --help-voices\n\n# List supported languages\nkokoro-tts --help-languages\n```\n\n> [!TIP]\n> If you're using Method 3, replace `kokoro-tts` with `uv run kokoro-tts` in the examples above.\n> If you're using Method 4, replace `kokoro-tts` with `uv run -m kokoro_tts` or `python -m kokoro_tts` in the examples above.\n\n## Features in Detail\n\n### EPUB Processing\n- Automatically extracts chapters from EPUB files\n- Preserves chapter titles and structure\n- Creates organized output for each chapter\n- Detailed debug output available for troubleshooting\n\n### Audio Processing\n- Chunks long text into manageable segments\n- Supports streaming for immediate playback\n- Voice blending with customizable mix ratios\n- Progress indicators for long processes\n- Handles interruptions gracefully\n\n### Output Options\n- Single file output\n- Split output with chapter organization\n- Chunk merging capability\n- Multiple audio format support\n\n### Debug Mode\n- Shows detailed information about file processing\n- Displays NCX parsing details for EPUB files\n- Lists all found chapters and their metadata\n- Helps troubleshoot processing issues\n\n### Input Options\n- Text file input (.txt)\n- EPUB book input (.epub)\n- Standard input (stdin)\n- Supports piping from other programs\n\n## Contributing\n\nThis is a personal project. But if you want to contribute, please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n\n## Acknowledgments\n\n- [Kokoro-ONNX](https://github.com/thewh1teagle/kokoro-onnx)\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.",
"version": "2.2.1",
"project_urls": {
"Homepage": "https://github.com/nazdridoy/kokoro-tts",
"Issues": "https://github.com/nazdridoy/kokoro-tts/issues",
"Repository": "https://github.com/nazdridoy/kokoro-tts"
},
"split_keywords": [
"audio",
" cli",
" epub",
" kokoro",
" pdf",
" text-to-speech",
" tts"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5ce781a57f57bc4b9f7d197c952c46dbb0e6370a9fc896d814bcaa677c70cbf0",
"md5": "1d6939e494e3fffd4c7721768d378b82",
"sha256": "e02e6073a7e2e0f31f09195b1767bb675667a2797bd2ec43170c1f44bd99e3b8"
},
"downloads": -1,
"filename": "kokoro_tts-2.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1d6939e494e3fffd4c7721768d378b82",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.9",
"size": 17959,
"upload_time": "2025-08-14T22:12:58",
"upload_time_iso_8601": "2025-08-14T22:12:58.883281Z",
"url": "https://files.pythonhosted.org/packages/5c/e7/81a57f57bc4b9f7d197c952c46dbb0e6370a9fc896d814bcaa677c70cbf0/kokoro_tts-2.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "545d9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b",
"md5": "44c87b950028add71a969c8fd370f274",
"sha256": "8288ca6d0f6b8ff51f205c30f251a64baf3788af25772633f5bf91ba662f2bda"
},
"downloads": -1,
"filename": "kokoro_tts-2.2.1.tar.gz",
"has_sig": false,
"md5_digest": "44c87b950028add71a969c8fd370f274",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.9",
"size": 1151937,
"upload_time": "2025-08-14T22:13:00",
"upload_time_iso_8601": "2025-08-14T22:13:00.262134Z",
"url": "https://files.pythonhosted.org/packages/54/5d/9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b/kokoro_tts-2.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-14 22:13:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nazdridoy",
"github_project": "kokoro-tts",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "beautifulsoup4",
"specs": []
},
{
"name": "ebooklib",
"specs": []
},
{
"name": "PyMuPDF",
"specs": []
},
{
"name": "kokoro-onnx",
"specs": [
[
"==",
"0.3.9"
]
]
},
{
"name": "pymupdf4llm",
"specs": []
},
{
"name": "sounddevice",
"specs": []
},
{
"name": "soundfile",
"specs": []
}
],
"lcname": "kokoro-tts"
}