aligner-pytorch


- Name: aligner-pytorch
- Version: 0.0.19
- Home page: https://github.com/archinetai/audio-diffusion-pytorch
- Summary: Aligner - PyTorch
- Upload time: 2022-11-30 07:19:39
- Author: Flavio Schneider
- License: MIT
- Keywords: artificial intelligence, deep learning, tts, alignment
- Requirements: none recorded

# Aligner - PyTorch

Sequence alignment methods with helpers for PyTorch.

## Install

```bash
pip install aligner-pytorch
```

[![PyPI - Version](https://img.shields.io/pypi/v/aligner-pytorch?style=flat&colorA=black&colorB=black)](https://pypi.org/project/aligner-pytorch/)


## Usage

### MAS

MAS (Monotonic Alignment Search) from Glow-TTS. It takes any similarity matrix of shape `[batch_size, x_length, y_length]` and returns a hard monotonic alignment of the same shape. The implementation is written in optimized Cython.

```py
import torch
from aligner_pytorch import mas

sim = torch.rand(1, 4, 6) # [batch_size, x_length, y_length]
alignment = mas(sim)

"""
sim = tensor([[
    [0.2, 0.8, 0.9, 0.9, 0.9, 0.4],
    [0.6, 0.8, 0.9, 0.7, 0.1, 0.4],
    [1.0, 0.4, 0.4, 0.2, 1.0, 0.7],
    [0.1, 0.3, 0.1, 0.7, 0.6, 0.9]
]])

alignment = tensor([[
    [1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1]
]], dtype=torch.int32)
"""
```
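
To make the search itself concrete, here is a minimal pure-Python sketch of the dynamic program described in the Glow-TTS paper, for a single unbatched matrix. It is not the package's Cython implementation; the function name `mas_reference` and the tie-breaking rule (prefer staying on the same row) are assumptions made for illustration.

```python
from typing import List


def mas_reference(sim: List[List[float]]) -> List[List[int]]:
    """Hard monotonic alignment maximizing the summed similarity.

    `sim` has shape [x_length, y_length] (one batch item, as nested lists).
    Hypothetical reference helper, not part of aligner_pytorch.
    """
    x_len, y_len = len(sim), len(sim[0])
    if x_len > y_len:
        raise ValueError("x_length must not exceed y_length")

    neg_inf = float("-inf")
    # q[i][j] is the best score of any monotonic path from (0, 0) to (i, j).
    q = [[neg_inf] * y_len for _ in range(x_len)]
    q[0][0] = sim[0][0]
    for j in range(1, y_len):
        for i in range(min(j + 1, x_len)):
            stay = q[i][j - 1]  # same x row, previous y column
            advance = q[i - 1][j - 1] if i > 0 else neg_inf  # previous x row
            q[i][j] = sim[i][j] + max(stay, advance)

    # Backtrack from the bottom-right corner to recover the path.
    alignment = [[0] * y_len for _ in range(x_len)]
    i = x_len - 1
    for j in range(y_len - 1, -1, -1):
        alignment[i][j] = 1
        if j > 0 and i > 0 and (i == j or q[i - 1][j - 1] > q[i][j - 1]):
            i -= 1
    return alignment


example_sim = [
    [0.9, 0.8, 0.1, 0.1],
    [0.1, 0.2, 0.9, 0.1],
    [0.1, 0.1, 0.2, 0.9],
]
example_alignment = mas_reference(example_sim)
# Summing each row gives the number of y steps assigned to each x step.
example_durations = [sum(row) for row in example_alignment]
```

Every column of the result contains exactly one `1`, and the active row never decreases from left to right, which is what makes the alignment monotonic.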

### XY Embedding to Alignment
Used during training to align an `x_embedding` with a `y_embedding`. It computes the log probability from a normal distribution and then the alignment with MAS.
```py
import torch
from aligner_pytorch import get_alignment_from_embeddings

x_embedding = torch.randn(1, 4, 10)  # [batch_size, x_length, features]
y_embedding = torch.randn(1, 6, 10)  # [batch_size, y_length, features]

alignment = get_alignment_from_embeddings(
    x_embedding=x_embedding,
    y_embedding=y_embedding,
)  # [batch_size, x_length, y_length]

"""
alignment = tensor([[
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1]
]], dtype=torch.int32)
"""
```
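
The README does not spell out how the log probability is formed, so the following is only a sketch of one plausible reading: each `x` vector is treated as the mean of a unit-variance diagonal Gaussian, and each `y` vector is scored under it. The unit variance, the choice of `x` as the mean, and the helper name `log_prob_similarity` are all assumptions, not the package's confirmed behaviour.

```python
import math
from typing import List


def log_prob_similarity(
    x_embedding: List[List[float]], y_embedding: List[List[float]]
) -> List[List[float]]:
    """Return sim[i][j] = log N(y_j; mean=x_i, variance=1), summed over features.

    Inputs are unbatched: [x_length, features] and [y_length, features].
    Hypothetical helper under the assumptions stated above.
    """
    features = len(x_embedding[0])
    # Normalization constant of a unit-variance Gaussian in `features` dimensions.
    const = -0.5 * features * math.log(2.0 * math.pi)
    return [
        [
            const - 0.5 * sum((y - x) ** 2 for x, y in zip(x_vec, y_vec))
            for y_vec in y_embedding
        ]
        for x_vec in x_embedding
    ]


x_demo = [[0.0, 0.0], [1.0, 1.0]]
y_demo = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
sim_demo = log_prob_similarity(x_demo, y_demo)  # shape [x_length, y_length]
```

A matrix built this way is highest where an `x` vector is closest to a `y` vector, so it can serve as the similarity input that MAS expects.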

### Duration Embedding to Alignment
Used during inference to compute the alignment from a trained duration embedding.
```py
import torch
from aligner_pytorch import get_alignment_from_duration_embedding

alignment = get_alignment_from_duration_embedding(
    embedding=torch.randn(1, 5),    # Embedding: [batch_size, x_length]
    scale=1.0,                      # Duration scale
    y_length=10                     # (Optional) fixes maximum output y_length
)                                   # Output alignment [batch_size, x_length, y_length]

"""
alignment = tensor([[
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
]])
"""
```
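
The output above has the shape of a duration expansion: each `x` step owns a contiguous run of `y` steps. As a rough illustration, the sketch below builds such a matrix from integer durations in pure Python. How the package turns the embedding and `scale` into integer durations is not described in this README, so that step is left out; the helper name `durations_to_alignment` and the truncation at `y_length` are assumptions.

```python
from typing import List, Optional


def durations_to_alignment(
    durations: List[int], y_length: Optional[int] = None
) -> List[List[int]]:
    """Expand per-x integer durations into a hard alignment [x_length, y_length].

    Hypothetical helper: runs that would extend past `y_length` are truncated,
    and unused trailing columns stay all zero.
    """
    if y_length is None:
        y_length = sum(durations)
    alignment = [[0] * y_length for _ in durations]
    start = 0
    for i, duration in enumerate(durations):
        end = min(start + duration, y_length)
        for j in range(start, end):
            alignment[i][j] = 1
        start = end
    return alignment


# Durations read off the example output above: 3 + 1 + 1 + 3 + 1 = 9 steps,
# so the tenth column remains empty when y_length is fixed to 10.
expanded = durations_to_alignment([3, 1, 1, 3, 1], y_length=10)
```

With these durations the sketch reproduces the matrix shown in the example output, including the empty final column.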


## Citations

Monotonic Alignment Search
```bibtex
@misc{2005.11129,
Author = {Jaehyeon Kim and Sungwon Kim and Jungil Kong and Sungroh Yoon},
Title = {Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search},
Year = {2020},
Eprint = {arXiv:2005.11129},
}
```

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/archinetai/audio-diffusion-pytorch",
    "name": "aligner-pytorch",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "artificial intelligence,deep learning,TTS,alignment",
    "author": "Flavio Schneider",
    "author_email": "archinetai@protonmail.com",
    "download_url": "https://files.pythonhosted.org/packages/6e/c3/2aa2dce18b99182a36df4fa8c24f2df72bb1da03c16374f6c1b368bc839b/aligner-pytorch-0.0.19.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Aligner - PyTorch",
    "version": "0.0.19",
    "split_keywords": [
        "artificial intelligence",
        "deep learning",
        "tts",
        "alignment"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "618f989d74ab8e70fce9a31789fae9d4",
                "sha256": "0a7e41b3586aa062a91c8c3c0346275378fa896a96d81315c2b21d0c3d678804"
            },
            "downloads": -1,
            "filename": "aligner-pytorch-0.0.19.tar.gz",
            "has_sig": false,
            "md5_digest": "618f989d74ab8e70fce9a31789fae9d4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 108522,
            "upload_time": "2022-11-30T07:19:39",
            "upload_time_iso_8601": "2022-11-30T07:19:39.142475Z",
            "url": "https://files.pythonhosted.org/packages/6e/c3/2aa2dce18b99182a36df4fa8c24f2df72bb1da03c16374f6c1b368bc839b/aligner-pytorch-0.0.19.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-11-30 07:19:39",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "archinetai",
    "github_project": "audio-diffusion-pytorch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "aligner-pytorch"
}
        