podcast-tts


- **Name**: podcast-tts
- **Version**: 0.0.1
- **Home page**: https://github.com/puntorigen/podcast_tts
- **Summary**: Generate high-quality TTS audio for podcasts and dialogues.
- **Upload time**: 2024-11-20 20:10:32
- **Maintainer**: None
- **Docs URL**: None
- **Author**: Pablo Schaffner
- **Requires Python**: >=3.8
- **License**: None
- **Requirements**: No requirements were recorded.

# Podcast TTS

`podcast_tts` is a Python library for generating podcasts and dialogues using text-to-speech (TTS). It supports multiple speakers, background music, and precise audio mixing for professional-quality results.

## Features

- **Multi-Speaker Support**: Generate dialogues with distinct speaker profiles.
- **Premade Voices**: Use premade speaker profiles (male1, male2, female2) included with the library or create custom profiles.
- **Dynamic Speaker Generation**: Automatically generate a new speaker profile when the specified speaker does not exist, and save it in the `voices` subfolder for future use.
- **Consistent Role Assignment**: Reuse the same profile for the same speaker name so each role keeps a consistent voice.
- **Channel-Specific Playback**: Play audio on the left, right, or both channels for spatial separation.
- **Text Normalization**: Automatically normalize text, handle contractions, and format special cases.
- **Background Music Integration**: Add background music with fade-in/out and volume control.
- **MP3 and URL Support**: Use local MP3/WAV files or download music from a URL with caching.
- **Output Formats**: Save generated audio as WAV or MP3 files.


## Installation

```bash
pip install podcast_tts
```

## Usage

### Generating Audio for a Single Speaker

```python 
import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)
    await tts.generate_wav(
        text="Hello! Welcome to our podcast.",
        speaker="male1",
        filename="output_audio.wav",
        channel="both"
    )

if __name__ == "__main__":
    asyncio.run(main())
``` 
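
The `channel` argument accepts `left`, `right`, or `both`. As a rough illustration of what channel placement means, the sketch below turns mono samples into stereo frames. `place_on_channel` is a hypothetical helper written for this explanation; it is not part of `podcast_tts`, and the library's own mixing may work differently.

```python
def place_on_channel(mono, channel="both"):
    """Turn mono samples into (left, right) stereo frames.

    Illustrative only: a silent channel is filled with 0.0.
    """
    if channel not in ("left", "right", "both"):
        raise ValueError(f"unknown channel: {channel!r}")
    left_on = channel in ("left", "both")
    right_on = channel in ("right", "both")
    return [(s if left_on else 0.0, s if right_on else 0.0) for s in mono]


if __name__ == "__main__":
    for name in ("left", "right", "both"):
        print(name, place_on_channel([0.5, -0.25], name))
```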

### Example: Generating a Podcast with Music

The `generate_podcast` method combines the dialogue and background music into a single audio file.

```python 
import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)

    # Define speakers and text
    texts = [
        {"male1": ["Welcome to the podcast!", "both"]},
        {"female2": ["Today, we discuss AI advancements.", "left"]},
        {"male2": ["Don't miss our exciting updates.", "right"]},
    ]

    # Define background music (local file or URL)
    music_config = ["https://example.com/background_music.mp3", 10, 3, 0.3]

    # Generate the podcast
    output_file = await tts.generate_podcast(
        texts=texts,
        music=music_config,
        filename="podcast_with_music.mp3",
        pause_duration=0.5,
        normalize=True
    )

    print(f"Podcast saved to: {output_file}")

if __name__ == "__main__":
    asyncio.run(main())
```
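
The example passes a URL as the music source, and the feature list mentions downloading with caching. How the library caches is not documented here, so the following is only a sketch of one common approach: derive a stable local filename from the URL and download only when that file is missing. The names `cached_music_path`, `fetch_music`, and the `music_cache` folder are assumptions for illustration, not part of `podcast_tts`.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def cached_music_path(url, cache_dir="music_cache"):
    """Map a URL to a stable local path (hypothetical cache layout)."""
    suffix = Path(urlparse(url).path).suffix or ".mp3"
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}{suffix}"


def fetch_music(url, cache_dir="music_cache"):
    """Download the file once; later calls reuse the cached copy."""
    target = cached_music_path(url, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))
    return target
```

The returned path could then be used as the first element of the music configuration instead of the URL.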

### Music Configuration

The `music` argument is a list of four values: `[file/url, full_volume_duration, fade_duration, target_volume]`.

- **file/url**: Path to a local MP3/WAV file or a URL to download.
- **full_volume_duration**: Time (seconds) the music plays at full volume before the dialogue starts and after it ends.
- **fade_duration**: Time (seconds) for fade-in/out effects.
- **target_volume**: Music volume level (0.0 to 1.0) during dialogue playback.
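
To make the three numeric values concrete, here is a minimal sketch of a music gain envelope. It assumes one possible timeline: full volume, a linear fade down to the target volume, the dialogue, a linear fade back up, then full volume again. The function `music_gain` and this exact timeline are assumptions for illustration; the library may overlap fades and dialogue differently.

```python
def music_gain(t, dialogue_len, full=10.0, fade=3.0, target=0.3):
    """Music gain at time t (seconds) under the assumed timeline.

    full   -> full_volume_duration
    fade   -> fade_duration
    target -> target_volume
    """
    start_dialogue = full + fade
    end_dialogue = start_dialogue + dialogue_len
    total = end_dialogue + fade + full
    if t < 0 or t > total:
        return 0.0  # outside the track
    if t < full:
        return 1.0  # intro at full volume
    if t < start_dialogue:
        # linear fade from full volume down to the target volume
        return 1.0 - (1.0 - target) * (t - full) / fade
    if t <= end_dialogue:
        return target  # music sits under the dialogue
    if t < end_dialogue + fade:
        # linear fade from the target volume back up to full volume
        return target + (1.0 - target) * (t - end_dialogue) / fade
    return 1.0  # outro at full volume


if __name__ == "__main__":
    # Values from the example config: 10 s full volume, 3 s fade, 0.3 target
    for second in (0, 11.5, 20, 34.5, 40):
        print(second, round(music_gain(second, dialogue_len=20), 3))
```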

## Premade Voices

PodcastTTS includes the following premade speaker profiles:

- male1
- male2
- female2

These profiles are included in the package's **default_voices** directory and can be used without additional setup.


## Dynamic Speaker Generation

When a specified speaker profile does not exist, the library automatically generates a new one and saves it in the `voices` subfolder. This keeps each speaker's voice consistent across turns in a dialogue.
For example:

```python
texts = [
    {"Narrator": ["Welcome to this exciting episode.", "left"]},
    {"Expert": ["Today, we'll explore AI's impact on healthcare.", "right"]},
]
# If "Narrator" or "Expert" profiles do not exist, they will be generated dynamically.
```

The profiles are saved in the script's `voices` directory and reused automatically whenever the same speaker name is used again, so each speaker keeps the same voice.
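
For longer scripts, writing the `texts` list by hand gets repetitive. The sketch below builds the same structure, one single-key dictionary per turn, from simple tuples. `build_texts` is a hypothetical convenience helper, not a function provided by `podcast_tts`.

```python
VALID_CHANNELS = {"left", "right", "both"}


def build_texts(turns):
    """Convert (speaker, line, channel) tuples into the `texts` structure."""
    texts = []
    for speaker, line, channel in turns:
        if channel not in VALID_CHANNELS:
            raise ValueError(f"unknown channel for {speaker!r}: {channel!r}")
        texts.append({speaker: [line, channel]})
    return texts


if __name__ == "__main__":
    script = [
        ("Narrator", "Welcome to this exciting episode.", "left"),
        ("Expert", "Today, we'll explore AI's impact on healthcare.", "right"),
        ("Narrator", "Let's get started.", "left"),
    ]
    print(build_texts(script))
```

Reusing the name `Narrator` in several turns is what lets the same profile be picked up each time.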

## Loading Existing Speaker Profiles

You can load any speaker profile by specifying its filename (without the `.txt` extension). Profiles are stored in the `voices` subfolder, so you don't need to specify the path explicitly.

```python
# Assuming a speaker profile "Host.txt" exists in the voices subfolder.
# Run this inside an async function, with `tts` created as in the examples above.
await tts.generate_wav("This is a test for an existing speaker.", "Host", "existing_speaker.wav")
```
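
To see which profile names are available before generating audio, you can list the `.txt` files yourself. This is a small sketch that assumes the `voices` folder sits in the current working directory; `saved_profiles` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def saved_profiles(voices_dir="voices"):
    """Return profile names (filenames without .txt) found in voices_dir."""
    folder = Path(voices_dir)
    if not folder.is_dir():
        return []  # nothing has been generated yet
    return sorted(p.stem for p in folder.glob("*.txt"))


if __name__ == "__main__":
    print(saved_profiles())
```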

## Additional Notes

- The library uses ChatTTS for high-quality TTS generation.
- Text is automatically cleaned and split into manageable chunks, making it easy to generate audio for long scripts or conversations.
- `generate_wav` saves audio in WAV format with support for channel-specific playback; `generate_podcast` can also write MP3 files, as in the podcast example above.
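
The library splits text for you, so the following is only meant to show the general idea behind chunking a long script. It is a minimal sketch that groups whole sentences up to a character limit; the function name `split_into_chunks` and the 200-character default are assumptions, and the library's actual rules may differ.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    script = "Welcome to the podcast! Today, we discuss AI advancements. Stay tuned."
    for chunk in split_into_chunks(script, max_chars=40):
        print(repr(chunk))
```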

## Contributing

Contributions are welcome! Feel free to submit issues or pull requests on the GitHub repository.

## License
This project is licensed under the MIT License. See the LICENSE file for details.

            
