kokoro-tts


| **Field** | **Value** |
| --- | --- |
| Name | kokoro-tts |
| Version | 2.2.1 |
| Upload time | 2025-08-14 22:13:00 |
| Requires Python | `<3.13,>=3.9` |
| License | MIT |
| Keywords | audio, cli, epub, kokoro, pdf, text-to-speech, tts |
| Requirements | beautifulsoup4, ebooklib, PyMuPDF, kokoro-onnx, pymupdf4llm, sounddevice, soundfile |

# Kokoro TTS

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

![Kokoro TTS preview](https://raw.githubusercontent.com/nazdridoy/kokoro-tts/main/previews/kokoro-tts-h.png)

## Features

- Multiple language and voice support
- Voice blending with customizable weights
- EPUB, PDF and TXT file input support
- Standard input (stdin) and `|` piping from other programs
- Streaming audio playback
- Split output into chapters
- Adjustable speech speed
- WAV and MP3 output formats
- Chapter merging capability
- Detailed debug output option
- GPU support

## Demo

Kokoro TTS is an open-source CLI tool that delivers high-quality text-to-speech right from your terminal. Think of it as your personal voice studio, capable of transforming any text into natural-sounding speech with minimal effort.

https://github.com/user-attachments/assets/8413e640-59e9-490e-861d-49187e967526

[Demo Audio (MP3)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.mp3) | [Demo Audio (WAV)](https://github.com/nazdridoy/kokoro-tts/raw/main/previews/demo.wav)

## TODO

- [x] Add GPU support
- [x] Add PDF support
- [ ] Add GUI

## Prerequisites

- Python 3.9-3.12 (Python 3.13+ is not currently supported)
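
The supported range can be checked before installing. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and not part of kokoro-tts, and the bounds are taken from the package's `requires_python` metadata (`<3.13,>=3.9`).

```python
import sys

# Supported range from the package metadata: >=3.9 and <3.13.
MIN_VERSION = (3, 9)
MAX_EXCLUSIVE = (3, 13)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is inside the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_EXCLUSIVE


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```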

## Installation

### Method 1: Install from PyPI (Recommended)

The easiest way to install Kokoro TTS is from PyPI:

```bash
# Using uv (recommended)
uv tool install kokoro-tts

# Using pip
pip install kokoro-tts
```

After installation, you can run:
```bash
kokoro-tts --help
```

### Method 2: Install from Git

Install directly from the repository:

```bash
# Using uv (recommended)
uv tool install git+https://github.com/nazdridoy/kokoro-tts

# Using pip
pip install git+https://github.com/nazdridoy/kokoro-tts
```

### Method 3: Clone and Install Locally

1. Clone the repository:
```bash
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
```

2. Install the package:

**With `uv` (recommended):**
```bash
uv venv
uv pip install -e .
```

**With `pip`:**
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -e .
```

3. Run the tool:
```bash
# If using uv
uv run kokoro-tts --help

# If using pip with activated venv
kokoro-tts --help
```

### Method 4: Run Without Installation

If you prefer to run without installing:

1. Clone the repository:
```bash
git clone https://github.com/nazdridoy/kokoro-tts.git
cd kokoro-tts
```

2. Install dependencies only:

**With `uv`:**
```bash
uv venv
uv sync
```

**With `pip`:**
```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

3. Run directly:
```bash
# With uv
uv run -m kokoro_tts --help

# With pip (venv activated)
python -m kokoro_tts --help
```

### Download Model Files

After installation, download the required model files to your working directory:

```bash
# Download voice data (bin format is preferred)
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/voices-v1.0.bin

# Download the model
wget https://github.com/nazdridoy/kokoro-tts/releases/download/v1.0.0/kokoro-v1.0.onnx
```

> [!NOTE]
> The tool automatically uses `voices-v1.0.bin` if present and falls back to `voices.json` if the bin file is not available. Place these files in the directory from which you run the `kokoro-tts` command.
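
The lookup order described in the note can be sketched as a small pre-flight check. This is an illustration of the documented behavior, not the tool's actual code; the function name `find_model_files` is hypothetical.

```python
from pathlib import Path

MODEL_FILE = "kokoro-v1.0.onnx"
# Preferred file first, fallback second, as described in the note above.
VOICE_FILES = ("voices-v1.0.bin", "voices.json")


def find_model_files(directory="."):
    """Return (model_path, voices_path) or raise FileNotFoundError."""
    base = Path(directory)
    model = base / MODEL_FILE
    if not model.is_file():
        raise FileNotFoundError(f"{MODEL_FILE} not found in {base.resolve()}")
    for name in VOICE_FILES:
        candidate = base / name
        if candidate.is_file():
            return model, candidate
    raise FileNotFoundError(f"none of {VOICE_FILES} found in {base.resolve()}")
```

Running such a check before a long conversion gives a clear error early if a download was skipped.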

## Supported voices

| **Category** | **Voices** | **Language Code** |
| --- | --- | --- |
| 🇺🇸 👩 | af\_alloy, af\_aoede, af\_bella, af\_heart, af\_jessica, af\_kore, af\_nicole, af\_nova, af\_river, af\_sarah, af\_sky | **en-us** |
| 🇺🇸 👨 | am\_adam, am\_echo, am\_eric, am\_fenrir, am\_liam, am\_michael, am\_onyx, am\_puck | **en-us** |
| 🇬🇧 | bf\_alice, bf\_emma, bf\_isabella, bf\_lily, bm\_daniel, bm\_fable, bm\_george, bm\_lewis | **en-gb** |
| 🇫🇷 | ff\_siwis | **fr-fr** |
| 🇮🇹 | if\_sara, im\_nicola | **it** |
| 🇯🇵 | jf\_alpha, jf\_gongitsune, jf\_nezumi, jf\_tebukuro, jm\_kumo | **ja** |
| 🇨🇳 | zf\_xiaobei, zf\_xiaoni, zf\_xiaoxiao, zf\_xiaoyi, zm\_yunjian, zm\_yunxi, zm\_yunxia, zm\_yunyang | **cmn** |
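
In the table, the first letter of each voice name lines up with one language code, and in the en-us rows the second letter appears to distinguish female (`f`) from male (`m`) voices. Assuming that pattern holds, a matching `--lang` value can be derived from a voice name. This is a sketch based only on the table; `language_for_voice` is a hypothetical helper, not a kokoro-tts function.

```python
# First letter of the voice prefix -> language code, read off the table above.
LANGUAGE_BY_PREFIX = {
    "a": "en-us",
    "b": "en-gb",
    "f": "fr-fr",
    "i": "it",
    "j": "ja",
    "z": "cmn",
}


def language_for_voice(voice):
    """Guess the language code for a voice name such as 'af_sarah'."""
    prefix, sep, _name = voice.partition("_")
    if not sep or len(prefix) != 2 or prefix[0] not in LANGUAGE_BY_PREFIX:
        raise ValueError(f"unrecognized voice name: {voice!r}")
    return LANGUAGE_BY_PREFIX[prefix[0]]
```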

## Usage

### Basic Usage

```bash
kokoro-tts <input_text_file> [<output_audio_file>] [options]
```

> [!NOTE]
> - If you installed via Method 1 (PyPI) or Method 2 (git install), use `kokoro-tts` directly
> - If you installed via Method 3 (local install), use `uv run kokoro-tts` or activate your virtual environment first
> - If you're using Method 4 (no install), use `uv run -m kokoro_tts` or `python -m kokoro_tts` with activated venv

### Commands

- `-h, --help`: Show help message
- `--help-languages`: List supported languages
- `--help-voices`: List available voices
- `--merge-chunks`: Merge existing chunks into chapter files

### Options

- `--stream`: Stream audio instead of saving to file
- `--speed <float>`: Set speech speed (default: 1.0)
- `--lang <str>`: Set language (default: en-us)
- `--voice <str>`: Set voice or blend voices (default: interactive selection)
  - Single voice: Use voice name (e.g., "af_sarah")
  - Blended voices: Use "voice1:weight,voice2:weight" format
- `--split-output <dir>`: Save each chunk as separate file in directory
- `--format <str>`: Audio format: wav or mp3 (default: wav)
- `--debug`: Show detailed debug information during processing
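
To make the blend format concrete, the sketch below parses a `--voice` value into normalized weights. It is an assumption about how such a string could be interpreted (weights scaled to sum to 1, equal shares when no weights are given); the tool's own parsing may differ, and `parse_voice_spec` is a hypothetical name.

```python
def parse_voice_spec(spec):
    """Parse 'af_sarah:60,am_adam:40' or 'am_adam,af_sarah' into {voice: fraction}."""
    parts = [p.strip() for p in spec.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty voice specification")

    named = [part.partition(":") for part in parts]
    has_weight = [bool(sep) for _name, sep, _raw in named]
    # Assumption: weights are given for every voice or for none of them.
    if any(has_weight) and not all(has_weight):
        raise ValueError("give a weight for every voice or for none")

    weights = {}
    for name, sep, raw in named:
        # Without explicit weights, every voice gets an equal share.
        weights[name.strip()] = float(raw) if sep else 1.0

    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {name: weight / total for name, weight in weights.items()}
```

For example, `parse_voice_spec("af_sarah:60,am_adam:40")` yields a 0.6/0.4 split, and `parse_voice_spec("am_adam,af_sarah")` yields 0.5 for each voice.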

### Input Formats

- `.txt`: Text file input
- `.epub`: EPUB book input (will process chapters)
- `.pdf`: PDF document input (extracts chapters from TOC or content)
- `-` or `/dev/stdin` (Linux/macOS) or `CONIN$` (Windows): Standard input (stdin)
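
The mapping from the input argument to a processing path can be pictured as a simple dispatch on the argument. This is an illustrative sketch, not the tool's implementation; `classify_input` and the returned labels are hypothetical.

```python
from pathlib import Path

# Names that select standard input, as listed above.
STDIN_NAMES = {"-", "/dev/stdin", "CONIN$"}
FILE_KINDS = {".txt": "text", ".epub": "epub", ".pdf": "pdf"}


def classify_input(argument):
    """Return 'stdin', 'text', 'epub', or 'pdf' for an input argument."""
    if argument in STDIN_NAMES:
        return "stdin"
    suffix = Path(argument).suffix.lower()
    try:
        return FILE_KINDS[suffix]
    except KeyError:
        raise ValueError(f"unsupported input: {argument!r}") from None
```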

### Examples

```bash
# Basic usage with output file
kokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah

# Read from standard input (stdin)
echo "Hello World" | kokoro-tts - --stream
cat input.txt | kokoro-tts - output.wav

# Cross-platform stdin support:
# The `-` form works the same way on Linux, macOS, and Windows:
#   echo "text" | kokoro-tts - --stream
# Platform-specific alternatives: kokoro-tts /dev/stdin --stream (Linux/macOS) or kokoro-tts CONIN$ --stream (Windows)

# Use voice blending (60-40 mix)
kokoro-tts input.txt output.wav --voice "af_sarah:60,am_adam:40"

# Use equal voice blend (50-50)
kokoro-tts input.txt --stream --voice "am_adam,af_sarah"

# Process EPUB and split into chunks
kokoro-tts input.epub --split-output ./chunks/ --format mp3

# Stream audio directly
kokoro-tts input.txt --stream --speed 0.8

# Merge existing chunks
kokoro-tts --merge-chunks --split-output ./chunks/ --format wav

# Process EPUB with detailed debug output
kokoro-tts input.epub --split-output ./chunks/ --debug

# Process PDF and split into chapters
kokoro-tts input.pdf --split-output ./chunks/ --format mp3

# List available voices
kokoro-tts --help-voices

# List supported languages
kokoro-tts --help-languages
```

> [!TIP]
> If you're using Method 3, replace `kokoro-tts` with `uv run kokoro-tts` in the examples above.
> If you're using Method 4, replace `kokoro-tts` with `uv run -m kokoro_tts` or `python -m kokoro_tts` in the examples above.
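
When calling the CLI from a script, it can help to assemble the argument list programmatically. The sketch below uses only the flags documented above; `build_command` and its parameter names are hypothetical, and the `runner` argument covers the Method 3 and Method 4 prefixes from the tip.

```python
import shlex


def build_command(input_path, output_path=None, *, runner=("kokoro-tts",),
                  stream=False, speed=None, lang=None, voice=None,
                  split_output=None, audio_format=None, debug=False):
    """Assemble an argv list for kokoro-tts from the documented options."""
    argv = [*runner, input_path]
    if output_path:
        argv.append(output_path)
    if stream:
        argv.append("--stream")
    valued_flags = (
        ("--speed", speed),
        ("--lang", lang),
        ("--voice", voice),
        ("--split-output", split_output),
        ("--format", audio_format),
    )
    for flag, value in valued_flags:
        if value is not None:
            argv += [flag, str(value)]
    if debug:
        argv.append("--debug")
    return argv


cmd = build_command("input.txt", "output.wav", speed=1.2, lang="en-us", voice="af_sarah")
print(shlex.join(cmd))
# prints: kokoro-tts input.txt output.wav --speed 1.2 --lang en-us --voice af_sarah
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. For Method 4, pass `runner=("python", "-m", "kokoro_tts")`.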

## Features in Detail

### EPUB Processing
- Automatically extracts chapters from EPUB files
- Preserves chapter titles and structure
- Creates organized output for each chapter
- Detailed debug output available for troubleshooting

### Audio Processing
- Chunks long text into manageable segments
- Supports streaming for immediate playback
- Voice blending with customizable mix ratios
- Progress indicators for long processes
- Handles interruptions gracefully
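
The README does not describe how the tool splits text, so the following is only one plausible approach: group whole sentences into chunks up to a character limit. `chunk_text` and `max_chars` are hypothetical names chosen for illustration.

```python
import re


def chunk_text(text, max_chars=500):
    """Group whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting on sentence boundaries keeps each synthesized segment from ending mid-sentence, which tends to matter for natural-sounding speech.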

### Output Options
- Single file output
- Split output with chapter organization
- Chunk merging capability
- Multiple audio format support
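
Chunk merging amounts to concatenating audio segments that share the same format. The sketch below does this for PCM WAV files with Python's standard `wave` module. It is not the tool's implementation (which may handle chapter grouping and MP3 as well); `merge_wav_files` is a hypothetical helper.

```python
import wave


def merge_wav_files(chunk_paths, output_path):
    """Concatenate WAV files that share channels, sample width, and sample rate."""
    chunk_paths = list(chunk_paths)
    if not chunk_paths:
        raise ValueError("no chunks to merge")
    with wave.open(str(output_path), "wb") as out:
        expected = None
        for path in chunk_paths:
            with wave.open(str(path), "rb") as chunk:
                fmt = chunk.getparams()[:3]  # (nchannels, sampwidth, framerate)
                if expected is None:
                    # The first chunk defines the output format.
                    expected = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != expected:
                    raise ValueError(f"{path} has a different audio format")
                out.writeframes(chunk.readframes(chunk.getnframes()))
```

Pass the chunk paths in playback order, for example a sorted list of the `.wav` files in the directory given to `--split-output`. The exact file layout inside that directory is an assumption here.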

### Debug Mode
- Shows detailed information about file processing
- Displays NCX parsing details for EPUB files
- Lists all found chapters and their metadata
- Helps troubleshoot processing issues

### Input Options
- Text file input (.txt)
- EPUB book input (.epub)
- PDF document input (.pdf)
- Standard input (stdin)
- Piping from other programs

## Contributing

This is a personal project, but contributions are welcome. Feel free to submit a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Kokoro-ONNX](https://github.com/thewh1teagle/kokoro-onnx)

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "kokoro-tts",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.9",
    "maintainer_email": null,
    "keywords": "audio, cli, epub, kokoro, pdf, text-to-speech, tts",
    "author": null,
    "author_email": "Nazmul Hossain <nazdridoy@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/54/5d/9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b/kokoro_tts-2.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.",
    "version": "2.2.1",
    "project_urls": {
        "Homepage": "https://github.com/nazdridoy/kokoro-tts",
        "Issues": "https://github.com/nazdridoy/kokoro-tts/issues",
        "Repository": "https://github.com/nazdridoy/kokoro-tts"
    },
    "split_keywords": [
        "audio",
        " cli",
        " epub",
        " kokoro",
        " pdf",
        " text-to-speech",
        " tts"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5ce781a57f57bc4b9f7d197c952c46dbb0e6370a9fc896d814bcaa677c70cbf0",
                "md5": "1d6939e494e3fffd4c7721768d378b82",
                "sha256": "e02e6073a7e2e0f31f09195b1767bb675667a2797bd2ec43170c1f44bd99e3b8"
            },
            "downloads": -1,
            "filename": "kokoro_tts-2.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1d6939e494e3fffd4c7721768d378b82",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.9",
            "size": 17959,
            "upload_time": "2025-08-14T22:12:58",
            "upload_time_iso_8601": "2025-08-14T22:12:58.883281Z",
            "url": "https://files.pythonhosted.org/packages/5c/e7/81a57f57bc4b9f7d197c952c46dbb0e6370a9fc896d814bcaa677c70cbf0/kokoro_tts-2.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "545d9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b",
                "md5": "44c87b950028add71a969c8fd370f274",
                "sha256": "8288ca6d0f6b8ff51f205c30f251a64baf3788af25772633f5bf91ba662f2bda"
            },
            "downloads": -1,
            "filename": "kokoro_tts-2.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "44c87b950028add71a969c8fd370f274",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.9",
            "size": 1151937,
            "upload_time": "2025-08-14T22:13:00",
            "upload_time_iso_8601": "2025-08-14T22:13:00.262134Z",
            "url": "https://files.pythonhosted.org/packages/54/5d/9f2898473632151e9343ce870f76de014775f35339a2c9438d2493c21c6b/kokoro_tts-2.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-14 22:13:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nazdridoy",
    "github_project": "kokoro-tts",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "beautifulsoup4",
            "specs": []
        },
        {
            "name": "ebooklib",
            "specs": []
        },
        {
            "name": "PyMuPDF",
            "specs": []
        },
        {
            "name": "kokoro-onnx",
            "specs": [
                [
                    "==",
                    "0.3.9"
                ]
            ]
        },
        {
            "name": "pymupdf4llm",
            "specs": []
        },
        {
            "name": "sounddevice",
            "specs": []
        },
        {
            "name": "soundfile",
            "specs": []
        }
    ],
    "lcname": "kokoro-tts"
}
        