descript-mlx


| Field | Value |
| --- | --- |
| Name | descript-mlx |
| Version | 0.0.2 |
| Summary | Descript Audio Codec - MLX |
| Upload time | 2024-10-28 20:25:35 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | artificial intelligence, asr, audio-generation, deep learning, transformers, text-to-speech |
| Requirements | No requirements were recorded. |

# Descript Audio Codec — MLX

An implementation of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) using the [MLX](https://github.com/ml-explore/mlx) framework.

The Descript Audio Codec (DAC) compresses 44.1 kHz audio into discrete codes at about 8 kbps, roughly a 90:1 compression ratio relative to the raw audio, while still producing high-quality reconstructions.
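
To see where figures like these come from, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope check, not code from `descript_mlx`: it assumes the raw audio is 16-bit mono PCM, and it assumes the configuration reported for the 44 kHz model in the DAC paper (a hop of 512 samples per latent frame and 9 codebooks of 1024 entries each). All constant and function names are hypothetical.

```python
import math

# Assumed properties of the raw input (not read from descript_mlx).
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz audio
BITS_PER_SAMPLE = 16      # 16-bit PCM
CHANNELS = 1              # mono

# Assumed codec configuration for the 44 kHz model (from the DAC paper).
HOP_LENGTH = 512          # input samples per latent frame
NUM_CODEBOOKS = 9         # residual codebooks per frame
CODEBOOK_SIZE = 1024      # entries per codebook


def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def codec_bitrate_bps(sample_rate_hz, hop_length, num_codebooks, codebook_size):
    """Bitrate of the discrete codes in bits per second."""
    frames_per_second = sample_rate_hz / hop_length
    bits_per_code = math.log2(codebook_size)  # bits to index one codebook
    return frames_per_second * num_codebooks * bits_per_code


raw_bps = raw_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
codec_bps = codec_bitrate_bps(
    SAMPLE_RATE_HZ, HOP_LENGTH, NUM_CODEBOOKS, CODEBOOK_SIZE
)
ratio = raw_bps / codec_bps

print(f"raw PCM bitrate: {raw_bps / 1000:.1f} kbps")
print(f"codec bitrate:   {codec_bps / 1000:.2f} kbps")
print(f"compression:     {ratio:.1f}:1")
```

Under these assumptions the codes come out slightly under 8 kbps, which is why the ratio lands close to 90:1. Stereo or higher bit-depth input would change the raw bitrate, and therefore the ratio.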

This repository is based on the [original PyTorch implementation](https://github.com/descriptinc/descript-audio-codec).

## Installation

```bash
pip install descript-mlx
```

## Usage

You can load a pretrained model from Python like this:

```python
import mlx.core as mx

from descript_mlx import DAC

dac = DAC.from_pretrained("44khz") # or "24khz" / "16khz"
audio = mx.array(...)

# encode into latents and codes
z, codes, latents, commitment_loss, codebook_loss = dac.encode(audio)

# reconstruct from latents/codes to audio
reconstructed_audio = dac.decode(z)

# compress audio to a DAC file
dac_file = dac.compress(audio)
dac_file.save("/path/to/file.dac")

# decompress audio from a DAC file
reconstructed_audio = dac.decompress("/path/to/file.dac")
```

## Citations

```bibtex
@misc{kumar2023highfidelityaudiocompressionimproved,
      title={High-Fidelity Audio Compression with Improved RVQGAN}, 
      author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
      year={2023},
      eprint={2306.06546},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2306.06546}, 
}
```

## License

The code in this repository is released under the MIT license as found in the
[LICENSE](LICENSE) file.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "descript-mlx",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "artificial intelligence, asr, audio-generation, deep learning, transformers, text-to-speech",
    "author": null,
    "author_email": "Lucas Newman <lucasnewman@me.com>",
    "download_url": "https://files.pythonhosted.org/packages/f4/f7/674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba/descript_mlx-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Descript Audio Codec - MLX",
    "version": "0.0.2",
    "project_urls": {
        "Homepage": "https://github.com/lucasnewman/descript-mlx"
    },
    "split_keywords": [
        "artificial intelligence",
        " asr",
        " audio-generation",
        " deep learning",
        " transformers",
        " text-to-speech"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "496aded846da947dec670912058d0485add578e5ca0485a9d80639c32eac243b",
                "md5": "6e02c6b1d759c1c0008c602e5b88ab0e",
                "sha256": "ac6ac83814797b71cd6fcdcc10c0bc32f6a03632a8dda71469a10a4e823a5678"
            },
            "downloads": -1,
            "filename": "descript_mlx-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6e02c6b1d759c1c0008c602e5b88ab0e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 11858,
            "upload_time": "2024-10-28T20:25:34",
            "upload_time_iso_8601": "2024-10-28T20:25:34.106184Z",
            "url": "https://files.pythonhosted.org/packages/49/6a/ded846da947dec670912058d0485add578e5ca0485a9d80639c32eac243b/descript_mlx-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f4f7674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba",
                "md5": "491e81ae042274ed48355c76387b07fc",
                "sha256": "1a5024313bff8cf1bba26b5fb81a2d0da65ec1f06249ecb732346432869050fe"
            },
            "downloads": -1,
            "filename": "descript_mlx-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "491e81ae042274ed48355c76387b07fc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 11598,
            "upload_time": "2024-10-28T20:25:35",
            "upload_time_iso_8601": "2024-10-28T20:25:35.538357Z",
            "url": "https://files.pythonhosted.org/packages/f4/f7/674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba/descript_mlx-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-28 20:25:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "lucasnewman",
    "github_project": "descript-mlx",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "descript-mlx"
}
        