ttsfm


- **Name**: ttsfm
- **Version**: 3.2.7
- **Summary**: Text-to-Speech API Client with OpenAI compatibility
- **Upload time**: 2025-08-24 06:46:18
- **Requires Python**: >=3.8
- **Keywords**: tts, text-to-speech, speech-synthesis, openai, api-client, audio, voice, speech
- **Requirements**: requests, aiohttp, fake-useragent

# TTSFM - Text-to-Speech API Client

> **Language / 语言**: [English](README.md) | [中文 (Chinese)](README.zh.md)

[![Docker Pulls](https://img.shields.io/docker/pulls/dbcccc/ttsfm?style=flat-square&logo=docker)](https://hub.docker.com/r/dbcccc/ttsfm)
[![GitHub Stars](https://img.shields.io/github/stars/dbccccccc/ttsfm?style=social)](https://github.com/dbccccccc/ttsfm)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=dbccccccc/ttsfm&type=Date)](https://www.star-history.com/#dbccccccc/ttsfm&Date)

🎤 **A modern, free Text-to-Speech API client with OpenAI compatibility**

TTSFM provides both synchronous and asynchronous Python clients for text-to-speech generation using the reverse-engineered openai.fm service. No API keys required - completely free to use!

## ✨ Key Features

- 🆓 **Completely Free** - Uses reverse-engineered openai.fm service (no API keys needed)
- 🎯 **OpenAI-Compatible** - Drop-in replacement for OpenAI's TTS API (`/v1/audio/speech`)
- ⚡ **Async & Sync** - Both `asyncio` and synchronous clients available
- 🗣️ **11 Voices** - All OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer, etc.)
- 🎵 **6 Audio Formats** - MP3, WAV, OPUS, AAC, FLAC, PCM support
- 🐳 **Docker Ready** - One-command deployment with web interface
- 🌐 **Web Interface** - Interactive playground for testing voices and formats
- 🔧 **CLI Tool** - Command-line interface for quick TTS generation
- 📦 **Type Hints** - Full type annotation support for better IDE experience
- 🛡️ **Error Handling** - Comprehensive exception hierarchy with retry logic
- ✨ **Auto-Combine** - Automatically handles long text with seamless audio combining
- 📊 **Text Validation** - Automatic text length validation and splitting
- 🔐 **API Key Protection** - Optional OpenAI-compatible authentication for secure deployments

## 📦 Installation

### Quick Install

```bash
pip install ttsfm
```

### Installation Options

```bash
# Basic installation (sync and async clients)
pip install ttsfm

# With web application support
# (quoting keeps shells such as zsh from expanding the brackets)
pip install "ttsfm[web]"

# With development tools
pip install "ttsfm[dev]"

# With documentation tools
pip install "ttsfm[docs]"

# Install all optional dependencies
pip install "ttsfm[web,dev,docs]"
```

### System Requirements

- **Python**: 3.8+ (tested on 3.8, 3.9, 3.10, 3.11, 3.12)
- **OS**: Windows, macOS, Linux
- **Dependencies**: `requests`, `aiohttp`, `fake-useragent`

## 🚀 Quick Start

### 🐳 Docker (Recommended)

Run TTSFM with web interface and OpenAI-compatible API:

```bash
# Using GitHub Container Registry
docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest

# Using Docker Hub
docker run -p 8000:8000 dbcccc/ttsfm:latest
```

**Available endpoints:**
- 🌐 **Web Interface**: http://localhost:8000
- 🔗 **OpenAI API**: http://localhost:8000/v1/audio/speech
- 📊 **Health Check**: http://localhost:8000/api/health

**Test the API:**

```bash
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini-tts","input":"Hello world!","voice":"alloy"}' \
  --output speech.mp3
```
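
For readers who prefer Python to `curl`, the same request can be sketched with the standard library alone. The helper names `build_speech_request` and `save_speech` are hypothetical (they are not part of TTSFM), and the sketch assumes the container from the previous step is listening on `localhost:8000`:

```python
import json
import urllib.request


def build_speech_request(text, voice="alloy", fmt="mp3",
                         base_url="http://localhost:8000", api_key=None):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:  # only needed when the server runs with REQUIRE_API_KEY=true
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def save_speech(request, path, timeout=30):
    """Send the request and write the returned audio bytes to `path`."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        with open(path, "wb") as out:
            out.write(resp.read())
```

Calling `save_speech(build_speech_request("Hello world!"), "speech.mp3")` should then produce the same file as the `curl` command, provided the server is running.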

### 📦 Python Package

#### Synchronous Client

```python
from ttsfm import TTSClient, Voice, AudioFormat

# Create client (uses free openai.fm service)
client = TTSClient()

# Generate speech
response = client.generate_speech(
    text="Hello! This is TTSFM - a free TTS service.",
    voice=Voice.CORAL,
    response_format=AudioFormat.MP3
)

# Save the audio file
response.save_to_file("output")  # Saves as output.mp3

# Or get raw audio data
audio_bytes = response.audio_data
print(f"Generated {len(audio_bytes)} bytes of audio")
```

#### Asynchronous Client

```python
import asyncio
from ttsfm import AsyncTTSClient, Voice

async def generate_speech():
    async with AsyncTTSClient() as client:
        response = await client.generate_speech(
            text="Async TTS generation!",
            voice=Voice.NOVA
        )
        response.save_to_file("async_output")

# Run async function
asyncio.run(generate_speech())
```

#### Long Text Processing (Python Package)

For developers who need fine-grained control over text splitting:

```python
from ttsfm import TTSClient, Voice, AudioFormat

# Create client
client = TTSClient()

# Generate speech from long text (creates separate files for each chunk)
responses = client.generate_speech_long_text(
    text="Very long text that exceeds 4096 characters...",
    voice=Voice.ALLOY,
    response_format=AudioFormat.MP3,
    max_length=2000,
    preserve_words=True
)

# Save each chunk as separate files
for i, response in enumerate(responses, 1):
    response.save_to_file(f"part_{i:03d}")  # Saves as part_001.mp3, part_002.mp3, etc.

print(f"Generated {len(responses)} audio files from long text")
```

#### OpenAI Python Client Compatibility

```python
from openai import OpenAI

# Point to TTSFM Docker container (no API key required by default)
client = OpenAI(
    api_key="not-needed",  # TTSFM is free by default
    base_url="http://localhost:8000/v1"
)

# When API key protection is enabled
client_with_auth = OpenAI(
    api_key="your-secret-api-key",  # Your TTSFM API key
    base_url="http://localhost:8000/v1"
)

# Generate speech (exactly like OpenAI)
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="alloy",
    input="Hello from TTSFM!"
)

response.stream_to_file("output.mp3")
```

#### Auto-Combine Feature for Long Text

TTSFM automatically handles long text (>4096 characters) with the new auto-combine feature:

```python
from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="http://localhost:8000/v1"
)

# Long text is automatically split and combined into a single audio file
long_article = """
Your very long article or document content here...
This can be thousands of characters long and TTSFM will
automatically split it into chunks, generate audio for each,
and combine them into a single seamless audio file.
""" * 100  # Make it really long

# This works seamlessly - no manual splitting needed!
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="nova",
    input=long_article,
    # auto_combine=True is the default
)

response.stream_to_file("long_article.mp3")  # Single combined file!

# Disable auto-combine for strict OpenAI compatibility.
# auto_combine is a TTSFM extension, so the OpenAI SDK must send it via extra_body.
response = client.audio.speech.create(
    model="gpt-4o-mini-tts",
    voice="nova",
    input="Short text only",
    extra_body={"auto_combine": False}  # Will error if text > 4096 chars
)
```

### 🖥️ Command Line Interface

```bash
# Basic usage
ttsfm "Hello, world!" --output hello.mp3

# Specify voice and format
ttsfm "Hello, world!" --voice nova --format wav --output hello.wav

# From file
ttsfm --text-file input.txt --output speech.mp3

# Custom service URL
ttsfm "Hello, world!" --url http://localhost:7000 --output hello.mp3

# List available voices
ttsfm --list-voices

# Get help
ttsfm --help
```

## ⚙️ Configuration

TTSFM automatically uses the free openai.fm service - **no configuration or API keys required by default!**

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `REQUIRE_API_KEY` | `false` | Enable API key protection |
| `TTSFM_API_KEY` | `None` | Your secret API key |
| `HOST` | `localhost` | Server host |
| `PORT` | `8000` | Server port |
| `DEBUG` | `false` | Debug mode |
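
As an illustration of how these variables and their defaults fit together, here is a minimal sketch of reading them into a typed config object. `ServerConfig`, `load_config`, the accepted truthy spellings, and the fail-fast check are all assumptions made for this example; the actual server may parse its environment differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    require_api_key: bool
    api_key: Optional[str]
    host: str
    port: int
    debug: bool


def _as_bool(value: str) -> bool:
    # Assumed truthy spellings; the real server may accept a different set.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the documented variables, falling back to the documented defaults."""
    config = ServerConfig(
        require_api_key=_as_bool(env.get("REQUIRE_API_KEY", "false")),
        api_key=env.get("TTSFM_API_KEY") or None,
        host=env.get("HOST", "localhost"),
        port=int(env.get("PORT", "8000")),
        debug=_as_bool(env.get("DEBUG", "false")),
    )
    # Fail early rather than run a "protected" server with no key to check.
    if config.require_api_key and not config.api_key:
        raise ValueError("REQUIRE_API_KEY is true but TTSFM_API_KEY is not set")
    return config
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise in tests.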

### Python Client Configuration

```python
from ttsfm import TTSClient

# Default client (uses openai.fm, no API key needed)
client = TTSClient()

# Custom configuration
client = TTSClient(
    base_url="https://www.openai.fm",  # Default
    timeout=30.0,                     # Request timeout
    max_retries=3,                    # Retry attempts
    verify_ssl=True                   # SSL verification
)

# For TTSFM server with API key protection
protected_client = TTSClient(
    base_url="http://localhost:8000",
    api_key="your-ttsfm-api-key"
)

# For other custom TTS services
custom_client = TTSClient(
    base_url="http://your-tts-service.com",
    api_key="your-api-key-if-needed"
)
```

## 🗣️ Available Voices

TTSFM supports all **11 OpenAI-compatible voices**:

| Voice | Description | Best For |
|-------|-------------|----------|
| `alloy` | Balanced and versatile | General purpose, neutral tone |
| `ash` | Clear and articulate | Professional, business content |
| `ballad` | Smooth and melodic | Storytelling, audiobooks |
| `coral` | Warm and friendly | Customer service, tutorials |
| `echo` | Resonant and clear | Announcements, presentations |
| `fable` | Expressive and dynamic | Creative content, entertainment |
| `nova` | Bright and energetic | Marketing, upbeat content |
| `onyx` | Deep and authoritative | News, serious content |
| `sage` | Wise and measured | Educational, informative |
| `shimmer` | Light and airy | Casual, conversational |
| `verse` | Rhythmic and flowing | Poetry, artistic content |

```python
from ttsfm import TTSClient, Voice

client = TTSClient()

# Use enum values
response = client.generate_speech("Hello!", voice=Voice.CORAL)

# Or use string values
response = client.generate_speech("Hello!", voice="coral")

# Test different voices
for voice in Voice:
    response = client.generate_speech(f"This is {voice.value} voice", voice=voice)
    response.save_to_file(f"test_{voice.value}")
```

## 🎵 Audio Formats

TTSFM supports **6 audio formats** with different quality and compression options:

| Format | Extension | Quality | File Size | Use Case |
|--------|-----------|---------|-----------|----------|
| `mp3` | `.mp3` | Good | Small | Web, mobile apps, general use |
| `opus` | `.opus` | Excellent | Small | Web streaming, VoIP |
| `aac` | `.aac` | Good | Medium | Apple devices, streaming |
| `flac` | `.flac` | Lossless | Large | High-quality archival |
| `wav` | `.wav` | Lossless | Large | Professional audio |
| `pcm` | `.pcm` | Raw | Large | Audio processing |

### **Usage Examples**

```python
from ttsfm import TTSClient, AudioFormat

client = TTSClient()

# Generate in different formats
formats = [
    AudioFormat.MP3,   # Most common
    AudioFormat.OPUS,  # Best compression
    AudioFormat.AAC,   # Apple compatible
    AudioFormat.FLAC,  # Lossless
    AudioFormat.WAV,   # Uncompressed
    AudioFormat.PCM    # Raw audio
]

for fmt in formats:
    response = client.generate_speech(
        text="Testing audio format",
        response_format=fmt
    )
    response.save_to_file(f"test_{fmt.value}")  # Extension is added automatically
```

### **Format Selection Guide**

- **Choose MP3** for:
  - Web applications
  - Mobile apps
  - Smaller file sizes
  - General-purpose audio

- **Choose OPUS** for:
  - Web streaming
  - VoIP applications
  - Best compression ratio
  - Real-time audio

- **Choose AAC** for:
  - Apple devices
  - Streaming services
  - Good quality/size balance

- **Choose FLAC** for:
  - Archival purposes
  - Lossless compression
  - Professional workflows

- **Choose WAV** for:
  - Professional audio production
  - Maximum compatibility
  - When file size is not a concern

- **Choose PCM** for:
  - Audio processing
  - Raw audio data
  - Custom applications

> **Note**: The library automatically optimizes requests to deliver the best quality for your chosen format. Files are always saved with the correct extension based on the audio format.
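
When you write audio bytes to disk yourself (for example from `response.audio_data`), a small helper can keep file names consistent with the format table. This is a hypothetical sketch, not the library's own implementation; `output_path` and `EXTENSIONS` are names invented for the example.

```python
from pathlib import Path

# Extensions taken from the format table in this section.
EXTENSIONS = {
    "mp3": ".mp3",
    "opus": ".opus",
    "aac": ".aac",
    "flac": ".flac",
    "wav": ".wav",
    "pcm": ".pcm",
}


def output_path(stem: str, fmt: str) -> Path:
    """Return `stem` with the extension for `fmt`, without doubling it."""
    try:
        ext = EXTENSIONS[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt!r}") from None
    path = Path(stem)
    if path.suffix.lower() == ext:
        return path  # already has the right extension
    return path.with_name(path.name + ext)
```

For instance, `output_path("output", "mp3")` and `output_path("output.mp3", "mp3")` both resolve to `output.mp3`.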



## 🌐 Web Interface

TTSFM includes a **beautiful web interface** for testing and experimentation:

![Web Interface](https://img.shields.io/badge/Web%20Interface-Available-brightgreen?style=flat-square)

**Features:**
- 🎮 **Interactive Playground** - Test voices and formats in real-time
- 📝 **Text Validation** - Character count and length validation
- 🎛️ **Advanced Options** - Voice instructions, auto-split long text
- 📊 **Audio Player** - Built-in player with duration and file size info
- 📥 **Download Support** - Download individual or batch audio files
- 🎲 **Random Text** - Generate random sample text for testing
- 📱 **Responsive Design** - Works on desktop, tablet, and mobile

Access at: http://localhost:8000 (when running Docker container)

## 🔗 API Endpoints

When running the Docker container, these endpoints are available:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/` | GET | Web interface |
| `/playground` | GET | Interactive TTS playground |
| `/v1/audio/speech` | POST | OpenAI-compatible TTS API |
| `/v1/models` | GET | List available models |
| `/api/health` | GET | Health check endpoint |
| `/api/voices` | GET | List available voices |
| `/api/formats` | GET | List supported audio formats |
| `/api/validate-text` | POST | Validate text length |

### OpenAI-Compatible API

```bash
# Generate speech (short text) - no API key required by default
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "Hello, this is a test!",
    "voice": "alloy",
    "response_format": "mp3"
  }' \
  --output speech.mp3

# Generate speech with API key (when protection is enabled)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-secret-api-key" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "Hello, this is a test!",
    "voice": "alloy",
    "response_format": "mp3"
  }' \
  --output speech.mp3

# Generate speech from long text with auto-combine (default behavior)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "This is a very long text that exceeds the 4096 character limit...",
    "voice": "alloy",
    "response_format": "mp3",
    "auto_combine": true
  }' \
  --output long_speech.mp3

# Generate speech from long text without auto-combine (will return error if text > 4096 chars)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "Your text here...",
    "voice": "alloy",
    "response_format": "mp3",
    "auto_combine": false
  }' \
  --output speech.mp3

# List models
curl http://localhost:8000/v1/models

# Health check
curl http://localhost:8000/api/health
```

#### **New Parameter: `auto_combine`**

TTSFM extends the OpenAI API with an optional `auto_combine` parameter:

- **`auto_combine`** (boolean, optional, default: `true`)
  - When `true`: Automatically splits long text (>4096 chars) into chunks, generates audio for each chunk, and combines them into a single seamless audio file
  - When `false`: Returns an error if text exceeds the 4096 character limit (standard OpenAI behavior)
  - **Benefits**: No need to manually manage text splitting or audio file merging for long content
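
To make the splitting step concrete, here is a minimal sketch of one way to break text into chunks of at most 4096 characters without cutting words. It is a greedy word-packing illustration only; `split_text` is a hypothetical name, TTSFM's own splitter may work differently (for example by sentence), and the audio-merging step is not shown.

```python
from typing import List


def split_text(text: str, max_length: int = 4096) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_length."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be hard-split.
        while len(word) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_length])
            word = word[max_length:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_length:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request, which is roughly what `auto_combine` spares you from doing by hand. Note that this sketch normalizes runs of whitespace to single spaces.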

## 🐳 Docker Deployment

### Quick Start

```bash
# Run with default settings (no API key required)
docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest

# Run with API key protection enabled
docker run -p 8000:8000 \
  -e REQUIRE_API_KEY=true \
  -e TTSFM_API_KEY=your-secret-api-key \
  ghcr.io/dbccccccc/ttsfm:latest

# Run with custom port
docker run -p 3000:8000 ghcr.io/dbccccccc/ttsfm:latest

# Run in background
docker run -d -p 8000:8000 --name ttsfm ghcr.io/dbccccccc/ttsfm:latest
```

### Docker Compose

```yaml
version: '3.8'
services:
  ttsfm:
    image: ghcr.io/dbccccccc/ttsfm:latest
    ports:
      - "8000:8000"
    environment:
      - PORT=8000
      # Optional: set REQUIRE_API_KEY=true to enable API key protection
      - REQUIRE_API_KEY=false
      - TTSFM_API_KEY=your-secret-api-key-here
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
```

### Available Images

| Registry | Image | Description |
|----------|-------|-------------|
| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:latest` | Latest stable release |
| Docker Hub | `dbcccc/ttsfm:latest` | Mirror on Docker Hub |
| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:v3.2.2` | Specific version |

## 🛠️ Advanced Usage

### Error Handling

```python
from ttsfm import TTSClient, TTSException, APIException, NetworkException

client = TTSClient()

try:
    response = client.generate_speech("Hello, world!")
    response.save_to_file("output")
except NetworkException as e:
    print(f"Network error: {e}")
except APIException as e:
    print(f"API error: {e}")
except TTSException as e:
    print(f"TTS error: {e}")
```
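
The client already retries on its own (see `max_retries` in the configuration section). If you want an extra layer of control around any call, a generic retry wrapper with exponential backoff is one option. The sketch below is standalone and illustrative; `call_with_retries` is a hypothetical helper, not a TTSFM function.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to max_retries times with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Delays grow as base_delay * 1, 2, 4, ... plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable when max_retries >= 0")
```

With TTSFM this could be used as `call_with_retries(lambda: client.generate_speech("Hello"), (NetworkException,))`, retrying only the error types you consider transient.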

### Text Validation and Splitting

```python
from ttsfm import TTSClient
from ttsfm.utils import validate_text_length, split_text_by_length

client = TTSClient()

# Validate text length
text = "Your long text here..."
is_valid, length = validate_text_length(text, max_length=4096)

if not is_valid:
    # Split long text into chunks
    chunks = split_text_by_length(text, max_length=4000)

    # Generate speech for each chunk
    for i, chunk in enumerate(chunks):
        response = client.generate_speech(chunk)
        response.save_to_file(f"output_part_{i}")
```

### Custom Headers and User Agents

```python
from ttsfm import TTSClient

# Client automatically uses realistic headers
client = TTSClient()

# Headers include:
# - Realistic User-Agent strings
# - Accept headers for audio content
# - Connection keep-alive
# - Accept-Encoding for compression
```

## 🔧 Development

### Local Development

```bash
# Clone repository
git clone https://github.com/dbccccccc/ttsfm.git
cd ttsfm

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run web application
cd ttsfm-web
python app.py
```

### Building Docker Image

```bash
# Build image
docker build -t ttsfm:local .

# Run local image
docker run -p 8000:8000 ttsfm:local
```

### Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## 📊 Performance

### Benchmarks

- **Latency**: ~1-3 seconds for typical text (depends on openai.fm service)
- **Throughput**: Supports concurrent requests with async client
- **Text Limits**: With auto-combine (the default), text longer than 4096 characters is split into chunks and merged into one file, so requests are not capped at 4096 characters
- **Audio Quality**: High-quality synthesis comparable to OpenAI

### Optimization Tips

```python
import asyncio
from ttsfm import AsyncTTSClient, TTSClient

# Use the async client to process multiple requests concurrently
async def generate_many():
    async with AsyncTTSClient() as client:
        tasks = [
            client.generate_speech(f"Text {i}")
            for i in range(10)
        ]
        return await asyncio.gather(*tasks)

responses = asyncio.run(generate_many())

# Reuse client instances
texts = ["First sentence.", "Second sentence."]
client = TTSClient()
for text in texts:
    response = client.generate_speech(text)  # Reuses connection
```
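
Firing many requests at once can overwhelm an upstream service, so it is often worth capping concurrency. A minimal sketch using `asyncio.Semaphore` follows; `gather_limited` is a hypothetical helper and the default limit of 4 is an arbitrary choice for the example, not a TTSFM recommendation.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")


async def gather_limited(
    make_call: Callable[[str], Awaitable[T]],
    texts: Sequence[str],
    limit: int = 4,
) -> List[T]:
    """Run make_call(text) for every text, at most `limit` at a time, keeping order."""
    semaphore = asyncio.Semaphore(limit)

    async def worker(text: str) -> T:
        async with semaphore:  # blocks while `limit` calls are already in flight
            return await make_call(text)

    return await asyncio.gather(*(worker(text) for text in texts))
```

Inside `async with AsyncTTSClient() as client:` this could be called as `await gather_limited(client.generate_speech, texts, limit=4)`; results come back in the same order as `texts`.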

## 🔐 API Key Protection (Optional)

TTSFM supports **OpenAI-compatible API key authentication** for secure deployments:

### Quick Setup

```bash
# Enable API key protection
export REQUIRE_API_KEY=true
export TTSFM_API_KEY=your-secret-api-key

# Run with protection enabled
docker run -p 8000:8000 \
  -e REQUIRE_API_KEY=true \
  -e TTSFM_API_KEY=your-secret-api-key \
  ghcr.io/dbccccccc/ttsfm:latest
```

### Authentication Methods

API keys are accepted in **OpenAI-compatible format**:

```python
from openai import OpenAI

# Standard OpenAI format
client = OpenAI(
    api_key="your-secret-api-key",
    base_url="http://localhost:8000/v1"
)

# Or using curl (run in a shell, not in Python):
# curl -X POST http://localhost:8000/v1/audio/speech \
#   -H "Authorization: Bearer your-secret-api-key" \
#   -H "Content-Type: application/json" \
#   -d '{"model":"gpt-4o-mini-tts","input":"Hello!","voice":"alloy"}'
```

### Features

- 🔑 **OpenAI-Compatible**: Uses standard `Authorization: Bearer` header
- 🛡️ **Multiple Auth Methods**: Header, query param, or JSON body
- 🎛️ **Configurable**: Easy enable/disable via environment variables
- 📊 **Security Logging**: Tracks invalid access attempts
- 🌐 **Web Interface**: Automatic API key field detection

### Protected Endpoints

When enabled, these endpoints require authentication:
- `POST /v1/audio/speech` - Speech generation
- `POST /api/generate` - Legacy speech generation
- `POST /api/generate-combined` - Combined speech generation

### Public Endpoints

These remain accessible without authentication:
- `GET /` - Web interface
- `GET /playground` - Interactive playground
- `GET /api/health` - Health check
- `GET /api/voices` - Available voices
- `GET /api/formats` - Supported formats
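
The split between protected and public endpoints can be pictured as a small gate in front of each request. The sketch below is an illustration of that idea under stated assumptions, not the server's actual code: `is_authorized` and `PROTECTED` are invented names, only the `Authorization: Bearer` header is handled (not the query parameter or JSON body options), and headers are assumed to arrive as a plain dictionary with canonical casing.

```python
import hmac
from typing import Mapping, Optional

# Routes listed above as requiring authentication when protection is enabled.
PROTECTED = {
    ("POST", "/v1/audio/speech"),
    ("POST", "/api/generate"),
    ("POST", "/api/generate-combined"),
}


def is_authorized(
    method: str,
    path: str,
    headers: Mapping[str, str],
    require_api_key: bool,
    expected_key: Optional[str],
) -> bool:
    """Return True if the request may proceed."""
    if not require_api_key or (method.upper(), path) not in PROTECTED:
        return True  # protection is off, or this is a public endpoint
    if not expected_key:
        return False  # protection is on but no key is configured: fail closed
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

Failing closed when no key is configured is a design choice of this sketch; it avoids silently accepting every request on a server that was meant to be protected.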

## 🔒 Security & Privacy

- **Optional API Keys**: Free by default, secure when needed
- **No Data Storage**: Audio is generated on-demand, not stored
- **HTTPS Support**: Secure connections to TTS service
- **No Tracking**: TTSFM doesn't collect or store user data
- **Open Source**: Full source code available for audit

## 📋 Changelog

See [CHANGELOG.md](CHANGELOG.md) for detailed version history.

### Latest Changes (v3.2.3)

- ✨ **Auto-Combine by Default**: Long text is now automatically split and combined into single audio files
- 🔄 **Unified API Endpoint**: Single `/v1/audio/speech` endpoint handles both short and long text intelligently
- 🎛️ **Configurable Behavior**: New `auto_combine` parameter (default: `true`) for full control
- 🤖 **Enhanced OpenAI Compatibility**: Drop-in replacement with intelligent long-text handling
- 📊 **Rich Response Headers**: `X-Auto-Combine`, `X-Chunks-Combined`, and processing metadata
- 🧹 **Streamlined Web Interface**: Removed legacy batch processing for cleaner user experience
- 📖 **Simplified Documentation**: Web docs emphasize modern auto-combine approach
- 🎮 **Enhanced Playground**: Clean interface focused on auto-combine functionality
- 🔐 **API Key Protection**: Optional OpenAI-compatible authentication for secure deployments
- 🛡️ **Security Features**: Comprehensive access control with detailed logging

## 🤝 Support & Community

- 🐛 **Bug Reports**: [GitHub Issues](https://github.com/dbccccccc/ttsfm/issues)
- 💬 **Discussions**: [GitHub Discussions](https://github.com/dbccccccc/ttsfm/discussions)
- 👤 **Author**: [@dbcccc](https://github.com/dbccccccc)
- ⭐ **Star the Project**: If you find TTSFM useful, please star it on GitHub!

## 📄 License

MIT License - see [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **OpenAI**: For the original TTS API design
- **openai.fm**: For providing the free TTS service
- **Community**: Thanks to all users and contributors who help improve TTSFM

---

<div align="center">

**TTSFM** - Free Text-to-Speech API with OpenAI Compatibility

[![GitHub](https://img.shields.io/badge/GitHub-dbccccccc/ttsfm-blue?style=flat-square&logo=github)](https://github.com/dbccccccc/ttsfm)
[![PyPI](https://img.shields.io/badge/PyPI-ttsfm-blue?style=flat-square&logo=pypi)](https://pypi.org/project/ttsfm/)
[![Docker](https://img.shields.io/badge/Docker-dbcccc/ttsfm-blue?style=flat-square&logo=docker)](https://hub.docker.com/r/dbcccc/ttsfm)

---

## 📖 Documentation

- 🇺🇸 **English**: [README.md](README.md)
- 🇨🇳 **中文 (Chinese)**: [README.zh.md](README.zh.md)

Made with ❤️ by [@dbcccc](https://github.com/dbccccccc)

</div>

            

Professional audio production\n  - Maximum compatibility\n  - When file size is not a concern\n\n- **Choose PCM** for:\n  - Audio processing\n  - Raw audio data\n  - Custom applications\n\n> **Note**: The library automatically optimizes requests to deliver the best quality for your chosen format. Files are always saved with the correct extension based on the audio format.\n\n\n\n## \ud83c\udf10 Web Interface\n\nTTSFM includes a **beautiful web interface** for testing and experimentation:\n\n![Web Interface](https://img.shields.io/badge/Web%20Interface-Available-brightgreen?style=flat-square)\n\n**Features:**\n- \ud83c\udfae **Interactive Playground** - Test voices and formats in real-time\n- \ud83d\udcdd **Text Validation** - Character count and length validation\n- \ud83c\udf9b\ufe0f **Advanced Options** - Voice instructions, auto-split long text\n- \ud83d\udcca **Audio Player** - Built-in player with duration and file size info\n- \ud83d\udce5 **Download Support** - Download individual or batch audio files\n- \ud83c\udfb2 **Random Text** - Generate random sample text for testing\n- \ud83d\udcf1 **Responsive Design** - Works on desktop, tablet, and mobile\n\nAccess at: http://localhost:8000 (when running Docker container)\n\n## \ud83d\udd17 API Endpoints\n\nWhen running the Docker container, these endpoints are available:\n\n| Endpoint | Method | Description |\n|----------|--------|-------------|\n| `/` | GET | Web interface |\n| `/playground` | GET | Interactive TTS playground |\n| `/v1/audio/speech` | POST | OpenAI-compatible TTS API |\n| `/v1/models` | GET | List available models |\n| `/api/health` | GET | Health check endpoint |\n| `/api/voices` | GET | List available voices |\n| `/api/formats` | GET | List supported audio formats |\n| `/api/validate-text` | POST | Validate text length |\n\n### OpenAI-Compatible API\n\n```bash\n# Generate speech (short text) - no API key required by default\ncurl -X POST http://localhost:8000/v1/audio/speech \\\n  -H 
\"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"gpt-4o-mini-tts\",\n    \"input\": \"Hello, this is a test!\",\n    \"voice\": \"alloy\",\n    \"response_format\": \"mp3\"\n  }' \\\n  --output speech.mp3\n\n# Generate speech with API key (when protection is enabled)\ncurl -X POST http://localhost:8000/v1/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -H \"Authorization: Bearer your-secret-api-key\" \\\n  -d '{\n    \"model\": \"gpt-4o-mini-tts\",\n    \"input\": \"Hello, this is a test!\",\n    \"voice\": \"alloy\",\n    \"response_format\": \"mp3\"\n  }' \\\n  --output speech.mp3\n\n# Generate speech from long text with auto-combine (default behavior)\ncurl -X POST http://localhost:8000/v1/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"gpt-4o-mini-tts\",\n    \"input\": \"This is a very long text that exceeds the 4096 character limit...\",\n    \"voice\": \"alloy\",\n    \"response_format\": \"mp3\",\n    \"auto_combine\": true\n  }' \\\n  --output long_speech.mp3\n\n# Generate speech from long text without auto-combine (will return error if text > 4096 chars)\ncurl -X POST http://localhost:8000/v1/audio/speech \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"gpt-4o-mini-tts\",\n    \"input\": \"Your text here...\",\n    \"voice\": \"alloy\",\n    \"response_format\": \"mp3\",\n    \"auto_combine\": false\n  }' \\\n  --output speech.mp3\n\n# List models\ncurl http://localhost:8000/v1/models\n\n# Health check\ncurl http://localhost:8000/api/health\n```\n\n#### **New Parameter: `auto_combine`**\n\nTTSFM extends the OpenAI API with an optional `auto_combine` parameter:\n\n- **`auto_combine`** (boolean, optional, default: `true`)\n  - When `true`: Automatically splits long text (>4096 chars) into chunks, generates audio for each chunk, and combines them into a single seamless audio file\n  - When `false`: Returns an error if text exceeds the 4096 character limit (standard 
OpenAI behavior)\n  - **Benefits**: No need to manually manage text splitting or audio file merging for long content\n\n## \ud83d\udc33 Docker Deployment\n\n### Quick Start\n\n```bash\n# Run with default settings (no API key required)\ndocker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest\n\n# Run with API key protection enabled\ndocker run -p 8000:8000 \\\n  -e REQUIRE_API_KEY=true \\\n  -e TTSFM_API_KEY=your-secret-api-key \\\n  ghcr.io/dbccccccc/ttsfm:latest\n\n# Run with custom port\ndocker run -p 3000:8000 ghcr.io/dbccccccc/ttsfm:latest\n\n# Run in background\ndocker run -d -p 8000:8000 --name ttsfm ghcr.io/dbccccccc/ttsfm:latest\n```\n\n### Docker Compose\n\n```yaml\nversion: '3.8'\nservices:\n  ttsfm:\n    image: ghcr.io/dbccccccc/ttsfm:latest\n    ports:\n      - \"8000:8000\"\n    environment:\n      - PORT=8000\n      # Optional: Enable API key protection\n      - REQUIRE_API_KEY=false\n      - TTSFM_API_KEY=your-secret-api-key-here\n    restart: unless-stopped\n    healthcheck:\n      test: [\"CMD\", \"curl\", \"-f\", \"http://localhost:8000/api/health\"]\n      interval: 30s\n      timeout: 10s\n      retries: 3\n```\n\n### Available Images\n\n| Registry | Image | Description |\n|----------|-------|-------------|\n| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:latest` | Latest stable release |\n| Docker Hub | `dbcccc/ttsfm:latest` | Mirror on Docker Hub |\n| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:v3.2.2` | Specific version |\n\n## \ud83d\udee0\ufe0f Advanced Usage\n\n### Error Handling\n\n```python\nfrom ttsfm import TTSClient, TTSException, APIException, NetworkException\n\nclient = TTSClient()\n\ntry:\n    response = client.generate_speech(\"Hello, world!\")\n    response.save_to_file(\"output\")\nexcept NetworkException as e:\n    print(f\"Network error: {e}\")\nexcept APIException as e:\n    print(f\"API error: {e}\")\nexcept TTSException as e:\n    print(f\"TTS error: {e}\")\n```\n\n### Text Validation and 
Splitting\n\n```python\nfrom ttsfm import TTSClient\nfrom ttsfm.utils import validate_text_length, split_text_by_length\n\nclient = TTSClient()\n\n# Validate text length\ntext = \"Your long text here...\"\nis_valid, length = validate_text_length(text, max_length=4096)\n\nif not is_valid:\n    # Split long text into chunks\n    chunks = split_text_by_length(text, max_length=4000)\n\n    # Generate speech for each chunk\n    for i, chunk in enumerate(chunks):\n        response = client.generate_speech(chunk)\n        response.save_to_file(f\"output_part_{i}\")\n```\n\n### Custom Headers and User Agents\n\n```python\nfrom ttsfm import TTSClient\n\n# Client automatically uses realistic headers\nclient = TTSClient()\n\n# Headers include:\n# - Realistic User-Agent strings\n# - Accept headers for audio content\n# - Connection keep-alive\n# - Accept-Encoding for compression\n```\n\n## \ud83d\udd27 Development\n\n### Local Development\n\n```bash\n# Clone repository\ngit clone https://github.com/dbccccccc/ttsfm.git\ncd ttsfm\n\n# Install in development mode (quoted so shells like zsh do not expand the brackets)\npip install -e \".[dev]\"\n\n# Run tests\npytest\n\n# Run web application\ncd ttsfm-web\npython app.py\n```\n\n### Building Docker Image\n\n```bash\n# Build image\ndocker build -t ttsfm:local .\n\n# Run local image\ndocker run -p 8000:8000 ttsfm:local\n```\n\n### Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## \ud83d\udcca Performance\n\n### Benchmarks\n\n- **Latency**: ~1-3 seconds for typical text (depends on openai.fm service)\n- **Throughput**: Supports concurrent requests with the async client\n- **Text Limits**: No limits with auto-combine! 
Handles text of any length automatically\n- **Audio Quality**: High-quality synthesis comparable to OpenAI\n\n### Optimization Tips\n\n```python\nimport asyncio\n\nfrom ttsfm import AsyncTTSClient, TTSClient\n\n# Use async client for better performance\nasync def generate_many():\n    async with AsyncTTSClient() as client:\n        # Process multiple requests concurrently\n        tasks = [\n            client.generate_speech(f\"Text {i}\")\n            for i in range(10)\n        ]\n        return await asyncio.gather(*tasks)\n\nresponses = asyncio.run(generate_many())\n\n# Reuse client instances\ntexts = [\"First sentence.\", \"Second sentence.\"]\nclient = TTSClient()\nfor text in texts:\n    response = client.generate_speech(text)  # Reuses connection\n```\n\n## \ud83d\udd10 API Key Protection (Optional)\n\nTTSFM supports **OpenAI-compatible API key authentication** for secure deployments:\n\n### Quick Setup\n\n```bash\n# Enable API key protection\nexport REQUIRE_API_KEY=true\nexport TTSFM_API_KEY=your-secret-api-key\n\n# Run with protection enabled\ndocker run -p 8000:8000 \\\n  -e REQUIRE_API_KEY=true \\\n  -e TTSFM_API_KEY=your-secret-api-key \\\n  ghcr.io/dbccccccc/ttsfm:latest\n```\n\n### Authentication Methods\n\nAPI keys are accepted in **OpenAI-compatible format**:\n\n```python\nfrom openai import OpenAI\n\n# Standard OpenAI format\nclient = OpenAI(\n    api_key=\"your-secret-api-key\",\n    base_url=\"http://localhost:8000/v1\"\n)\n```\n\n```bash\n# Or using curl\ncurl -X POST http://localhost:8000/v1/audio/speech \\\n  -H \"Authorization: Bearer your-secret-api-key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"model\":\"gpt-4o-mini-tts\",\"input\":\"Hello!\",\"voice\":\"alloy\"}' \\\n  --output speech.mp3\n```\n\n### Features\n\n- \ud83d\udd11 **OpenAI-Compatible**: Uses standard `Authorization: Bearer` header\n- \ud83d\udee1\ufe0f **Multiple Auth Methods**: Header, query param, or JSON body\n- \ud83c\udf9b\ufe0f **Configurable**: Easy enable/disable via environment variables\n- \ud83d\udcca **Security Logging**: Tracks invalid access attempts\n- \ud83c\udf10 **Web Interface**: Automatic API key field detection\n\n### Protected Endpoints\n\nWhen enabled, these endpoints require 
authentication:\n- `POST /v1/audio/speech` - Speech generation\n- `POST /api/generate` - Legacy speech generation\n- `POST /api/generate-combined` - Combined speech generation\n\n### Public Endpoints\n\nThese remain accessible without authentication:\n- `GET /` - Web interface\n- `GET /playground` - Interactive playground\n- `GET /api/health` - Health check\n- `GET /api/voices` - Available voices\n- `GET /api/formats` - Supported formats\n\n## \ud83d\udd12 Security & Privacy\n\n- **Optional API Keys**: Free by default, secure when needed\n- **No Data Storage**: Audio is generated on-demand, not stored\n- **HTTPS Support**: Secure connections to TTS service\n- **No Tracking**: TTSFM doesn't collect or store user data\n- **Open Source**: Full source code available for audit\n\n## \ud83d\udccb Changelog\n\nSee [CHANGELOG.md](CHANGELOG.md) for detailed version history.\n\n### Latest Changes (v3.2.3)\n\n- \u2728 **Auto-Combine by Default**: Long text is now automatically split and combined into single audio files\n- \ud83d\udd04 **Unified API Endpoint**: Single `/v1/audio/speech` endpoint handles both short and long text intelligently\n- \ud83c\udf9b\ufe0f **Configurable Behavior**: New `auto_combine` parameter (default: `true`) for full control\n- \ud83e\udd16 **Enhanced OpenAI Compatibility**: Drop-in replacement with intelligent long-text handling\n- \ud83d\udcca **Rich Response Headers**: `X-Auto-Combine`, `X-Chunks-Combined`, and processing metadata\n- \ud83e\uddf9 **Streamlined Web Interface**: Removed legacy batch processing for cleaner user experience\n- \ud83d\udcd6 **Simplified Documentation**: Web docs emphasize modern auto-combine approach\n- \ud83c\udfae **Enhanced Playground**: Clean interface focused on auto-combine functionality\n- \ud83d\udd10 **API Key Protection**: Optional OpenAI-compatible authentication for secure deployments\n- \ud83d\udee1\ufe0f **Security Features**: Comprehensive access control with detailed logging\n\n## \ud83e\udd1d Support & 
Community\n\n- \ud83d\udc1b **Bug Reports**: [GitHub Issues](https://github.com/dbccccccc/ttsfm/issues)\n- \ud83d\udcac **Discussions**: [GitHub Discussions](https://github.com/dbccccccc/ttsfm/discussions)\n- \ud83d\udc64 **Author**: [@dbcccc](https://github.com/dbccccccc)\n- \u2b50 **Star the Project**: If you find TTSFM useful, please star it on GitHub!\n\n## \ud83d\udcc4 License\n\nMIT License - see [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- **OpenAI**: For the original TTS API design\n- **openai.fm**: For providing the free TTS service\n- **Community**: Thanks to all users and contributors who help improve TTSFM\n\n---\n\n<div align=\"center\">\n\n**TTSFM** - Free Text-to-Speech API with OpenAI Compatibility\n\n[![GitHub](https://img.shields.io/badge/GitHub-dbccccccc/ttsfm-blue?style=flat-square&logo=github)](https://github.com/dbccccccc/ttsfm)\n[![PyPI](https://img.shields.io/badge/PyPI-ttsfm-blue?style=flat-square&logo=pypi)](https://pypi.org/project/ttsfm/)\n[![Docker](https://img.shields.io/badge/Docker-dbcccc/ttsfm-blue?style=flat-square&logo=docker)](https://hub.docker.com/r/dbcccc/ttsfm)\n\n---\n\n## \ud83d\udcd6 Documentation\n\n- \ud83c\uddfa\ud83c\uddf8 **English**: [README.md](README.md)\n- \ud83c\udde8\ud83c\uddf3 **\u4e2d\u6587**: [README.zh.md](README.zh.md)\n\nMade with \u2764\ufe0f by [@dbcccc](https://github.com/dbccccccc)\n\n</div>\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Text-to-Speech API Client with OpenAI compatibility",
    "version": "3.2.7",
    "project_urls": {
        "Bug Tracker": "https://github.com/dbccccccc/ttsfm/issues",
        "Documentation": "https://github.com/dbccccccc/ttsfm/blob/main/docs/",
        "Homepage": "https://github.com/dbccccccc/ttsfm",
        "Repository": "https://github.com/dbccccccc/ttsfm"
    },
    "split_keywords": [
        "tts",
        " text-to-speech",
        " speech-synthesis",
        " openai",
        " api-client",
        " audio",
        " voice",
        " speech"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fc5e86a4179f12ea469f452937ab44179980cee5401c7bf8d13146dc4024bc98",
                "md5": "0b15e10980d028180df23af195aa0153",
                "sha256": "8ffd2cef1cf3ff881c56296549254f5ac633251239144bf909936ba1207200ce"
            },
            "downloads": -1,
            "filename": "ttsfm-3.2.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "0b15e10980d028180df23af195aa0153",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 33829,
            "upload_time": "2025-08-24T06:46:17",
            "upload_time_iso_8601": "2025-08-24T06:46:17.369624Z",
            "url": "https://files.pythonhosted.org/packages/fc/5e/86a4179f12ea469f452937ab44179980cee5401c7bf8d13146dc4024bc98/ttsfm-3.2.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fcb91f6b437993ffba9e8a83f395e33c6af80971620cb344ae614558b79a13bf",
                "md5": "473afcca9d67e70d77809c6e6e8b93bf",
                "sha256": "68eec3bc75a39a0c8c435ecc55366aee236fe9229a158e3334197f544334f730"
            },
            "downloads": -1,
            "filename": "ttsfm-3.2.7.tar.gz",
            "has_sig": false,
            "md5_digest": "473afcca9d67e70d77809c6e6e8b93bf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 278098,
            "upload_time": "2025-08-24T06:46:18",
            "upload_time_iso_8601": "2025-08-24T06:46:18.540319Z",
            "url": "https://files.pythonhosted.org/packages/fc/b9/1f6b437993ffba9e8a83f395e33c6af80971620cb344ae614558b79a13bf/ttsfm-3.2.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-24 06:46:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dbccccccc",
    "github_project": "ttsfm",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.25.0"
                ]
            ]
        },
        {
            "name": "aiohttp",
            "specs": [
                [
                    ">=",
                    "3.8.0"
                ]
            ]
        },
        {
            "name": "fake-useragent",
            "specs": [
                [
                    ">=",
                    "1.4.0"
                ]
            ]
        }
    ],
    "lcname": "ttsfm"
}
        