fairytaler


| Field | Value |
|---|---|
| Name | fairytaler |
| Version | 0.1.1 |
| Home page | https://github.com/painebenjamin/fairytaler |
| Summary | An unofficial reimplementation of F5TTS |
| Upload time | 2024-12-09 03:38:00 |
| Maintainer | None |
| Docs URL | None |
| Author | Benjamin Paine |
| Requires Python | >=3.8.0 |
| License | cc-by-nc-4.0 |
| Keywords | |
| VCS | |
| Bug tracker URL | |
| Requirements | accelerate, einops, huggingface-hub, librosa, numpy, pillow, safetensors, scipy, scikit-learn, torch, torchaudio, torchdiffeq, torchvision, transformers, vocos |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

This is a re-implementation of [F5-TTS](https://github.com/SWivid/F5-TTS) aimed at reducing dependencies, increasing speed, reducing model size, and improving usability.

# Installation

Fairytaler assumes you have a working CUDA environment to install into.

```
pip install fairytaler
```
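
Before or after installing, it can help to confirm that the interpreter meets the `>=3.8` requirement from the package metadata and, if PyTorch is present, that it can see a CUDA device. The following is a minimal sketch; `check_environment` is a hypothetical helper for illustration and is not part of fairytaler.

```python
import sys


def check_environment(min_python=(3, 8)):
    """Return a dict describing whether this interpreter looks ready for fairytaler.

    Hypothetical helper for illustration; not part of the fairytaler package.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    try:
        import torch  # listed among fairytaler's requirements
    except ImportError:
        # torch is not installed yet, so CUDA support cannot be probed.
        report["torch_installed"] = False
        report["cuda_available"] = None
    else:
        report["torch_installed"] = True
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is `False`, the PyTorch build or the GPU driver is the likely place to look first.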

# How to Use

You do not need to pre-download anything; the necessary data is downloaded at runtime. Weights are fetched from [Hugging Face](https://huggingface.co/benjamin-paine/fairytaler).

## Command Line

Use the `fairytaler` binary from the command line like so:

```sh
fairytaler examples/reference.wav examples/reference.txt "Hello, this is some test audio!"
```

Many options are available; for complete documentation, run `fairytaler --help`.
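
For scripted or batch use, the same command can be launched from Python. This sketch only builds the three positional arguments shown above (reference audio, reference transcript file, text to speak) and assumes the `fairytaler` executable is on `PATH`; `build_command` and `synthesize` are hypothetical names, not part of the package.

```python
import shlex
import subprocess


def build_command(reference_audio, reference_text_file, text):
    """Build the argument list for the CLI call shown above."""
    return ["fairytaler", str(reference_audio), str(reference_text_file), text]


def synthesize(reference_audio, reference_text_file, text):
    """Run the CLI; raises CalledProcessError on a non-zero exit code."""
    command = build_command(reference_audio, reference_text_file, text)
    # shlex.join gives a copy-pasteable, shell-quoted form for logging.
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or other special characters.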

## Python

```py
from fairytaler import F5TTSPipeline

pipeline = F5TTSPipeline.from_pretrained(
  "benjamin-paine/fairytaler",
  variant="fp16", # Omit for float32
  device="auto"
)
output_wav_file = pipeline(
  text="Hello, this is some test audio!",
  reference_audio="examples/reference.wav",
  reference_text="examples/reference.txt",
  output_save=True
)
print(f"Output saved to {output_wav_file}")
```

The full execution signature is:

```py
def __call__(
    self,
    text: Union[str, List[str]],
    reference_audio: AudioType,
    reference_text: str,
    reference_sample_rate: Optional[int]=None,
    seed: SeedType=None,
    speed: float=1.0,
    sway_sampling_coef: float=-1.0,
    target_rms: float=0.1,
    cross_fade_duration: float=0.15,
    punctuation_pause_duration: float=0.10,
    num_steps: int=32,
    cfg_strength: float=2.0,
    fix_duration: Optional[float]=None,
    use_tqdm: bool=False,
    output_format: AUDIO_OUTPUT_FORMAT_LITERAL="wav",
    output_save: bool=False,
) -> AudioResultType
```

Valid `output_format` values are `wav`, `ogg`, `flac`, `mp3`, `float` and `int`. Passing `output_save=True` saves the result to a file; otherwise, the data is returned directly.
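
Because an unsupported `output_format` would otherwise only surface once the pipeline runs, it can be convenient to validate arguments up front. The sketch below checks the format against the values listed above and assembles keyword arguments for the call. `build_call_kwargs` is a hypothetical helper, and the exact type of the data returned for `float` and `int` is not specified here.

```python
VALID_OUTPUT_FORMATS = ("wav", "ogg", "flac", "mp3", "float", "int")


def build_call_kwargs(text, reference_audio, reference_text,
                      output_format="wav", output_save=False, **extra):
    """Validate output_format and collect keyword arguments for the pipeline call.

    Hypothetical helper; extra keyword arguments (e.g. seed, num_steps)
    are passed through unchanged.
    """
    if output_format not in VALID_OUTPUT_FORMATS:
        raise ValueError(
            f"output_format must be one of {VALID_OUTPUT_FORMATS}, "
            f"got {output_format!r}"
        )
    kwargs = dict(
        text=text,
        reference_audio=reference_audio,
        reference_text=reference_text,
        output_format=output_format,
        output_save=output_save,
    )
    kwargs.update(extra)
    return kwargs


# Usage with a pipeline created via F5TTSPipeline.from_pretrained(...):
# result = pipeline(**build_call_kwargs(
#     text="Hello, this is some test audio!",
#     reference_audio="examples/reference.wav",
#     reference_text="examples/reference.txt",
#     output_format="flac",
#     output_save=True,
# ))
```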

# Citation

```
@misc{chen2024f5ttsfairytalerfakesfluent,
      title={F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching}, 
      author={Yushen Chen and Zhikang Niu and Ziyang Ma and Keqi Deng and Chunhui Wang and Jian Zhao and Kai Yu and Xie Chen},
      year={2024},
      eprint={2410.06885},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2410.06885}, 
}
```

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/painebenjamin/fairytaler",
    "name": "fairytaler",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8.0",
    "maintainer_email": null,
    "keywords": null,
    "author": "Benjamin Paine",
    "author_email": "painebenjamin@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a3/f4/4d28f61a5ad24f136193ca9a94dcd94e451a4d70e9affc93164f594deb03/fairytaler-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "cc-by-nc-4.0",
    "summary": "An unofficial reimplementation of F5TTS",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/painebenjamin/fairytaler"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a3f44d28f61a5ad24f136193ca9a94dcd94e451a4d70e9affc93164f594deb03",
                "md5": "86fde4a2dcce5d0873ee11dbbb151717",
                "sha256": "48bfead1770f6b6f3de2b6fe6dff516f5d7d163461faecc4f3ec8fe4c6d2f142"
            },
            "downloads": -1,
            "filename": "fairytaler-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "86fde4a2dcce5d0873ee11dbbb151717",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8.0",
            "size": 39845,
            "upload_time": "2024-12-09T03:38:00",
            "upload_time_iso_8601": "2024-12-09T03:38:00.865585Z",
            "url": "https://files.pythonhosted.org/packages/a3/f4/4d28f61a5ad24f136193ca9a94dcd94e451a4d70e9affc93164f594deb03/fairytaler-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-09 03:38:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "painebenjamin",
    "github_project": "fairytaler",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "accelerate",
            "specs": [
                [
                    "~=",
                    "1.0"
                ]
            ]
        },
        {
            "name": "einops",
            "specs": [
                [
                    ">=",
                    "0.8"
                ]
            ]
        },
        {
            "name": "huggingface-hub",
            "specs": [
                [
                    "~=",
                    "0.26"
                ]
            ]
        },
        {
            "name": "librosa",
            "specs": [
                [
                    ">=",
                    "0.10"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "~=",
                    "1.22"
                ]
            ]
        },
        {
            "name": "pillow",
            "specs": [
                [
                    "~=",
                    "9.5"
                ]
            ]
        },
        {
            "name": "safetensors",
            "specs": [
                [
                    "~=",
                    "0.4"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.11"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    "~=",
                    "1.5"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    ">=",
                    "2.4"
                ]
            ]
        },
        {
            "name": "torchaudio",
            "specs": [
                [
                    ">=",
                    "2.4"
                ]
            ]
        },
        {
            "name": "torchdiffeq",
            "specs": [
                [
                    "~=",
                    "0.2"
                ]
            ]
        },
        {
            "name": "torchvision",
            "specs": [
                [
                    ">=",
                    "0.19"
                ]
            ]
        },
        {
            "name": "transformers",
            "specs": [
                [
                    ">=",
                    "4.41"
                ]
            ]
        },
        {
            "name": "vocos",
            "specs": [
                [
                    "~=",
                    "0.1"
                ]
            ]
        }
    ],
    "lcname": "fairytaler"
}
        