dasp-pytorch

Name	dasp-pytorch
Version	0.0.1
home_page	https://github.com/csteinmetz1/dasp-pytorch
Summary	Differentiable audio processors in PyTorch.
upload_time	2023-11-12 17:40:06
author	Christian Steinmetz
requires_python	>=3.8.0
license	Apache License 2.0
requirements	No requirements were recorded.

            
<div align="center">

<img src="docs/assets/dasp-no-bg.png" width="200px">

# dasp

<i> Differentiable audio signal processors in PyTorch </i>

</div>

<img src="docs/assets/box.svg" width="30px"> &nbsp; Includes reverberation, distortion, dynamic range processing, equalization, stereo processing.

<img src="docs/assets/gear.svg" width="30px"> &nbsp; Enables virtual analog modeling, blind parameter estimation, automated DSP, and style transfer.

<img src="docs/assets/gpu-card.svg" width="30px"> &nbsp; Supports batched processing on both CPU and GPU for fast training and reduced bottlenecks.

<img src="docs/assets/code-slash.svg" width="30px"> &nbsp; Open source and free to use for academic and commercial applications under Apache 2.0 license.

## Installation 

```
git clone https://github.com/csteinmetz1/dasp-pytorch
cd dasp-pytorch
pip install -e .
```

Note: Coming to PyPI soon to enable `pip install dasp-pytorch`.

## Examples

`dasp-pytorch` is a Python library for constructing differentiable audio signal processors using PyTorch. 
These differentiable processors can be used standalone or within the computation graph of neural networks. 
We provide purely functional interfaces for all processors, which makes them easy to use and portable across projects.
Unless otherwise stated, all effect functions expect 3-dim tensors with shape `(batch_size, num_channels, num_samples)` as input and output.
Using an effect in your computation graph is as simple as calling the function with the input tensor as argument. 
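
As a minimal sketch of this shape convention, the snippet below uses only PyTorch (no `dasp-pytorch` calls) to bring mono and stereo waveforms into the `(batch_size, num_channels, num_samples)` layout. The sample rate, the random signals, and the variable names are illustrative assumptions.

```python
import torch

sample_rate = 44100  # assumed sample rate, for illustration only
num_samples = sample_rate  # one second of audio

# a mono signal as a flat 1-dim tensor: (num_samples,)
mono = torch.randn(num_samples)

# add channel and batch dims: (1, 1, num_samples)
mono_batch = mono.view(1, 1, -1)

# a stereo signal in channels-first layout: (num_channels, num_samples)
stereo = torch.randn(2, num_samples)

# add only the batch dim: (1, 2, num_samples)
stereo_batch = stereo.unsqueeze(0)

# several equal-length clips can be stacked along the batch dim
clips = [torch.randn(2, num_samples) for _ in range(4)]
batch = torch.stack(clips, dim=0)  # (4, 2, num_samples)

print(mono_batch.shape, stereo_batch.shape, batch.shape)
```

Clips of different lengths would need to be cropped or padded to a common length before stacking.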

### Quickstart

Here is a minimal example to demonstrate reverse engineering the drive value of a simple distortion effect using gradient descent. 

```python
import torch
import torchaudio
import dasp_pytorch

# Load audio
x, sr = torchaudio.load("audio/short_riff.wav")

# create batch dim
# (batch_size, n_channels, n_samples)
x = x.unsqueeze(0)

# apply some distortion with 16 dB drive
drive = torch.tensor([16.0])
y = dasp_pytorch.functional.distortion(x, drive)

# create a parameter to optimize
drive_hat = torch.nn.Parameter(torch.tensor(0.0))
optimizer = torch.optim.Adam([drive_hat], lr=0.01)

# optimize the parameter
n_iters = 2500
for n in range(n_iters):
    # apply distortion with the estimated parameter
    y_hat = dasp_pytorch.functional.distortion(x, drive_hat)

    # compute distance between estimate and target
    loss = torch.nn.functional.mse_loss(y_hat, y)

    # optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(
        f"step: {n+1}/{n_iters}, loss: {loss.item():.3f}, drive: {drive_hat.item():.3f}"
    )
```
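
The same idea can be tried without an audio file or the library installed. The sketch below is a stand-in, not the library's implementation: `toy_distortion` is a hypothetical function that assumes the drive is a gain in dB applied before `tanh` soft clipping (under that assumption, 16 dB corresponds to a linear gain of 10^(16/20) ≈ 6.31), and the input is a synthetic sine wave. The learning rate and step count are chosen for this toy problem only.

```python
import math
import torch


def toy_distortion(x: torch.Tensor, drive_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: dB gain followed by tanh soft clipping."""
    return torch.tanh(x * 10.0 ** (drive_db / 20.0))


# synthetic input: 0.1 s of a 220 Hz sine
# shape (batch_size, n_channels, n_samples)
sr = 16000
t = torch.arange(int(0.1 * sr)) / sr
x = (0.5 * torch.sin(2 * math.pi * 220.0 * t)).view(1, 1, -1)

# target signal produced with a "hidden" 16 dB drive
y = toy_distortion(x, torch.tensor(16.0))

# start the estimate at 0 dB and recover the drive by gradient descent
# (larger learning rate than above so the toy problem converges quickly)
drive_hat = torch.nn.Parameter(torch.tensor(0.0))
optimizer = torch.optim.Adam([drive_hat], lr=0.5)

losses = []
for n in range(500):
    loss = torch.nn.functional.mse_loss(toy_distortion(x, drive_hat), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"final loss: {losses[-1]:.6f}, estimated drive: {drive_hat.item():.2f} dB")
```

Because `tanh` saturates, the gradient with respect to the drive shrinks as the estimate approaches heavy clipping, which is one reason an adaptive optimizer such as Adam is a convenient choice here.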

For the remaining examples we will use the [GuitarSet](https://guitarset.weebly.com/) dataset. 
You can download the data using the following commands:
```bash
mkdir data
wget https://zenodo.org/records/3371780/files/audio_mono-mic.zip
unzip audio_mono-mic.zip
rm audio_mono-mic.zip
```

### More examples

- [Virtual Analog Modeling](examples/virtual_analog.py)
- [Automatic Equalization](examples/auto_eq.py)
- [Audio Production Style Transfer](examples/style_transfer.py)

## Audio Processors

<table>
    <tr>
        <th>Audio Processor</th>
        <th>Functional Interface</th>
    </tr>
    <tr>
        <td>Gain</td>
        <td><code>gain()</code></td>
    </tr>
    <tr>
        <td>Distortion</td>
        <td><code>distortion()</code></td>
    </tr>
    <tr>
        <td>Parametric Equalizer</td>
        <td><code>parametric_eq()</code></td>
    </tr>
    <tr>
        <td>Dynamic range compressor</td>
        <td><code>compressor()</code></td>
    </tr>
    <tr>
        <td>Dynamic range expander</td>
        <td><code>expander()</code></td>
    </tr>    
    <tr>
        <td>Reverberation</td>
        <td><code>noise_shaped_reverberation()</code></td>
    </tr>
    <tr>
        <td>Stereo Widener</td>
        <td><code>stereo_widener()</code></td>
    </tr>
    <tr>
        <td>Stereo Panner</td>
        <td><code>stereo_panner()</code></td>
    </tr>
    <tr>
        <td>Stereo Bus</td>
        <td><code>stereo_bus()</code></td>
    </tr>
</table>
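
Because each processor is exposed as a plain function, effects can be chained by ordinary function composition, and gradients flow through the whole chain. The sketch below illustrates this with two hypothetical stand-ins (`toy_gain` and `toy_distortion`) written in plain PyTorch. They are not the library's functions, and the real signatures should be taken from the library itself.

```python
import torch


def toy_gain(x: torch.Tensor, gain_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: apply one gain in dB per batch item."""
    return x * 10.0 ** (gain_db.view(-1, 1, 1) / 20.0)


def toy_distortion(x: torch.Tensor, drive_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: dB drive followed by tanh soft clipping."""
    return torch.tanh(x * 10.0 ** (drive_db.view(-1, 1, 1) / 20.0))


# batch of 4 stereo clips: (batch_size, num_channels, num_samples)
x = 0.1 * torch.randn(4, 2, 1024)

# one parameter value per batch item, both tracked by autograd
gain_db = torch.zeros(4, requires_grad=True)
drive_db = torch.full((4,), 6.0, requires_grad=True)

# chain the effects by composing the functions
y = toy_gain(toy_distortion(x, drive_db), gain_db)

# a scalar loss on the output backpropagates to every parameter in the chain
y.pow(2).mean().backward()
print(y.shape, gain_db.grad.shape, drive_db.grad.shape)
```

In this sketch the parameters are free tensors, but they could equally be the outputs of a neural network that predicts one value per batch item.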

## Citations

If you use this library, consider citing these papers:

Differentiable parametric EQ and dynamic range compressor
```bibtex
@article{steinmetz2022style,
  title={Style transfer of audio effects with differentiable signal processing},
  author={Steinmetz, Christian J and Bryan, Nicholas J and Reiss, Joshua D},
  journal={arXiv preprint arXiv:2207.08759},
  year={2022}
}
```

Differentiable artificial reverberation with frequency-band noise shaping
```bibtex
@inproceedings{steinmetz2021filtered,
  title={Filtered noise shaping for time domain room impulse 
         response estimation from reverberant speech},
  author={Steinmetz, Christian J and Ithapu, Vamsi Krishna and Calamia, Paul},
  booktitle={WASPAA},
  year={2021},
  organization={IEEE}
}
```

Differentiable IIR filters
```bibtex
@inproceedings{nercessian2020neural,
  title={Neural parametric equalizer matching using differentiable biquads},
  author={Nercessian, Shahan},
  booktitle={DAFx},
  year={2020}
}
```

```bibtex
@inproceedings{colonel2022direct,
  title={Direct design of biquad filter cascades with deep learning 
          by sampling random polynomials},
  author={Colonel, Joseph T and Steinmetz, Christian J and 
          Michelen, Marcus and Reiss, Joshua D},
  booktitle={ICASSP},
  year={2022},
  organization={IEEE}
}
```

## Acknowledgements

Supported by the EPSRC UKRI Centre for Doctoral Training in Artificial Intelligence and Music (EP/S022694/1).

<p float="left">
    <img src="docs/assets/logos/qm.png" height="50px"> &nbsp; &nbsp; 
    <img src="docs/assets/logos/aim.png"  height="50px"> &nbsp; &nbsp; 
    <img src="docs/assets/logos/ukri.png"  height="50px"> &nbsp; &nbsp; 
</p>

Raw data

{
    "_id": null,
    "home_page": "https://github.com/csteinmetz1/dasp-pytorch",
    "name": "dasp-pytorch",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Christian Steinmetz",
    "author_email": "c.j.steinmetz@qmul.ac.uk",
    "download_url": "https://files.pythonhosted.org/packages/91/5e/ca9bd836fa78b0d2e9d7561f33e52ff3014db22600bc05719bd68432fe22/dasp-pytorch-0.0.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Differentiable audio processors in PyTorch.",
    "version": "0.0.1",
    "project_urls": {
        "Homepage": "https://github.com/csteinmetz1/dasp-pytorch"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c3b4b3a3f60af0bc82178fce8d99328c1504e4c732e3fb85aece8fddd3943a8c",
                "md5": "76da0043323b2405a85b7271eca63ee3",
                "sha256": "2116a253c40dbab74eecfc93ebae9c00f0bb7df698c2cc1f9796aeaef5cb6f48"
            },
            "downloads": -1,
            "filename": "dasp_pytorch-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "76da0043323b2405a85b7271eca63ee3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8.0",
            "size": 18452,
            "upload_time": "2023-11-12T17:40:05",
            "upload_time_iso_8601": "2023-11-12T17:40:05.046105Z",
            "url": "https://files.pythonhosted.org/packages/c3/b4/b3a3f60af0bc82178fce8d99328c1504e4c732e3fb85aece8fddd3943a8c/dasp_pytorch-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "915eca9bd836fa78b0d2e9d7561f33e52ff3014db22600bc05719bd68432fe22",
                "md5": "2f08c6829155a7bfdb5a6d41930f0a7b",
                "sha256": "23a3853550bcc7caa0964440a83588662e6cb2d0aa458cdfc4aabd1afa1c9eba"
            },
            "downloads": -1,
            "filename": "dasp-pytorch-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "2f08c6829155a7bfdb5a6d41930f0a7b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8.0",
            "size": 19757,
            "upload_time": "2023-11-12T17:40:06",
            "upload_time_iso_8601": "2023-11-12T17:40:06.454016Z",
            "url": "https://files.pythonhosted.org/packages/91/5e/ca9bd836fa78b0d2e9d7561f33e52ff3014db22600bc05719bd68432fe22/dasp-pytorch-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-12 17:40:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "csteinmetz1",
    "github_project": "dasp-pytorch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "dasp-pytorch"
}
