# Descript Audio Codec — MLX
An implementation of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) using the [MLX](https://github.com/ml-explore/mlx) framework.
The Descript Audio Codec (DAC) compresses 44.1 kHz audio into discrete codes at 8 kbps, roughly a 90:1 compression ratio relative to the raw audio, while still producing high-quality reconstructions.
This repository is based on the [original PyTorch implementation](https://github.com/descriptinc/descript-audio-codec).
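
The compression ratio quoted above can be sanity-checked with a little arithmetic. The sketch below assumes the raw audio is 16-bit mono PCM sampled at 44.1 kHz; the bit depth and channel count are assumptions, not something this README states, and the helper names are hypothetical rather than part of `descript_mlx`.

```python
# Back-of-the-envelope check of the compression ratio.
# Assumption: raw audio is 16-bit mono PCM at 44.1 kHz.

def raw_pcm_bitrate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def compression_ratio(raw_bps: float, codec_bps: float) -> float:
    """How many times smaller the coded stream is than the raw stream."""
    return raw_bps / codec_bps


raw_bps = raw_pcm_bitrate(sample_rate_hz=44_100, bit_depth=16, channels=1)
ratio = compression_ratio(raw_bps, codec_bps=8_000)

print(f"raw PCM: {raw_bps / 1000:.1f} kbps")  # raw PCM: 705.6 kbps
print(f"ratio:   {ratio:.1f}:1")              # ratio:   88.2:1
```

Under these assumptions the ratio comes out at about 88:1, consistent with the rounded 90:1 figure.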
## Installation
```bash
pip install descript-mlx
```
## Usage
You can load a pretrained model from Python like this:
```python
import mlx.core as mx
from descript_mlx import DAC
dac = DAC.from_pretrained("44khz") # or "24khz" / "16khz"
audio = mx.array(...)
# encode into latents and codes
z, codes, latents, commitment_loss, codebook_loss = dac.encode(audio)
# reconstruct from latents/codes to audio
reconstructed_audio = dac.decode(z)
# compress audio to a DAC file
dac_file = dac.compress(audio)
dac_file.save("/path/to/file.dac")
# decompress audio from a DAC file
reconstructed_audio = dac.decompress("/path/to/file.dac")
```
## Citations
```bibtex
@misc{kumar2023highfidelityaudiocompressionimproved,
title={High-Fidelity Audio Compression with Improved RVQGAN},
author={Rithesh Kumar and Prem Seetharaman and Alejandro Luebs and Ishaan Kumar and Kundan Kumar},
year={2023},
eprint={2306.06546},
archivePrefix={arXiv},
primaryClass={cs.SD},
url={https://arxiv.org/abs/2306.06546},
}
```
## License
The code in this repository is released under the MIT license as found in the
[LICENSE](LICENSE) file.
Raw data
{
"_id": null,
"home_page": null,
"name": "descript-mlx",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "artificial intelligence, asr, audio-generation, deep learning, transformers, text-to-speech",
"author": null,
"author_email": "Lucas Newman <lucasnewman@me.com>",
"download_url": "https://files.pythonhosted.org/packages/f4/f7/674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba/descript_mlx-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Descript Audio Codec - MLX",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/lucasnewman/descript-mlx"
},
"split_keywords": [
"artificial intelligence",
" asr",
" audio-generation",
" deep learning",
" transformers",
" text-to-speech"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "496aded846da947dec670912058d0485add578e5ca0485a9d80639c32eac243b",
"md5": "6e02c6b1d759c1c0008c602e5b88ab0e",
"sha256": "ac6ac83814797b71cd6fcdcc10c0bc32f6a03632a8dda71469a10a4e823a5678"
},
"downloads": -1,
"filename": "descript_mlx-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6e02c6b1d759c1c0008c602e5b88ab0e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 11858,
"upload_time": "2024-10-28T20:25:34",
"upload_time_iso_8601": "2024-10-28T20:25:34.106184Z",
"url": "https://files.pythonhosted.org/packages/49/6a/ded846da947dec670912058d0485add578e5ca0485a9d80639c32eac243b/descript_mlx-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f4f7674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba",
"md5": "491e81ae042274ed48355c76387b07fc",
"sha256": "1a5024313bff8cf1bba26b5fb81a2d0da65ec1f06249ecb732346432869050fe"
},
"downloads": -1,
"filename": "descript_mlx-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "491e81ae042274ed48355c76387b07fc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 11598,
"upload_time": "2024-10-28T20:25:35",
"upload_time_iso_8601": "2024-10-28T20:25:35.538357Z",
"url": "https://files.pythonhosted.org/packages/f4/f7/674d5825e2f5d822c3c608a380667477a348494e19e42711ccb7545b7dba/descript_mlx-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-28 20:25:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lucasnewman",
"github_project": "descript-mlx",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "descript-mlx"
}