torchlibrosa


Name: torchlibrosa
Version: 0.1.0
Summary: PyTorch implementation of part of librosa functions.
Home page: https://github.com/qiuqiangkong/torchlibrosa
Author: Qiuqiang Kong
Requires Python: >=3.6
Upload time: 2023-02-21 09:40:54
# TorchLibrosa: PyTorch implementation of Librosa

This codebase provides a PyTorch implementation of some librosa functions. If you previously trained on CPU-extracted librosa features but want GPU acceleration during training and evaluation, TorchLibrosa produces features that are almost identical to those of the corresponding librosa functions (numerical difference less than 1e-5).

## Install
```bash
$ pip install torchlibrosa
```

## Example 1

Extract a log mel spectrogram with TorchLibrosa.

```python
import torch
import torchlibrosa as tl

batch_size = 16
sample_rate = 22050
win_length = 2048
hop_length = 512
n_mels = 128

batch_audio = torch.empty(batch_size, sample_rate).uniform_(-1, 1)  # (batch_size, sample_rate)

# TorchLibrosa feature extractor, equivalent to librosa.feature.melspectrogram()
feature_extractor = torch.nn.Sequential(
    tl.Spectrogram(
        hop_length=hop_length,
        win_length=win_length,
    ), tl.LogmelFilterBank(
        sr=sample_rate,
        n_mels=n_mels,
        is_log=False,  # Default is True
    ))
batch_feature = feature_extractor(batch_audio) # (batch_size, 1, time_steps, mel_bins)
```
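To see how closely this matches librosa, the snippet below (a minimal sketch, not part of the package) compares the output against `librosa.feature.melspectrogram()`. It assumes the default `fmin`/`fmax`/`power` settings of the two libraries line up; pass them explicitly to both if your results differ.

```python
import numpy as np
import librosa
import torch
import torchlibrosa as tl

sample_rate = 22050
win_length = 2048
hop_length = 512
n_mels = 128

audio = np.random.uniform(-1, 1, sample_rate).astype(np.float32)

# Reference: librosa power mel spectrogram, shape (n_mels, time_steps)
melspec_librosa = librosa.feature.melspectrogram(
    y=audio, sr=sample_rate, n_fft=win_length, hop_length=hop_length,
    win_length=win_length, n_mels=n_mels, pad_mode='reflect')

# TorchLibrosa: is_log=False keeps the (non-log) mel spectrogram
extractor = torch.nn.Sequential(
    tl.Spectrogram(hop_length=hop_length, win_length=win_length),
    tl.LogmelFilterBank(sr=sample_rate, n_mels=n_mels, is_log=False))
melspec_torch = extractor(torch.from_numpy(audio)[None, :])  # (1, 1, time_steps, mel_bins)
melspec_torch = melspec_torch[0, 0].detach().numpy().T       # (mel_bins, time_steps)

# Compare on the overlapping frames in case the padding conventions differ by a frame
frames = min(melspec_librosa.shape[1], melspec_torch.shape[1])
print(np.abs(melspec_librosa[:, :frames] - melspec_torch[:, :frames]).max())
```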

## Example 2

Extract a spectrogram, log mel spectrogram, STFT, and ISTFT with TorchLibrosa.

```python
import torch
import torchlibrosa as tl

batch_size = 16
sample_rate = 22050
win_length = 2048
hop_length = 512
n_mels = 128

batch_audio = torch.empty(batch_size, sample_rate).uniform_(-1, 1)  # (batch_size, sample_rate)

# Spectrogram
spectrogram_extractor = tl.Spectrogram(n_fft=win_length, hop_length=hop_length)
sp = spectrogram_extractor.forward(batch_audio)   # (batch_size, 1, time_steps, freq_bins)

# Log mel spectrogram
logmel_extractor = tl.LogmelFilterBank(sr=sample_rate, n_fft=win_length, n_mels=n_mels)
logmel = logmel_extractor.forward(sp)   # (batch_size, 1, time_steps, mel_bins)

# STFT
stft_extractor = tl.STFT(n_fft=win_length, hop_length=hop_length)
(real, imag) = stft_extractor.forward(batch_audio)
# real: (batch_size, 1, time_steps, freq_bins), imag: (batch_size, 1, time_steps, freq_bins)

# ISTFT
istft_extractor = tl.ISTFT(n_fft=win_length, hop_length=hop_length)
y = istft_extractor.forward(real, imag, length=batch_audio.shape[-1])    # (batch_size, samples_num)
```
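As a quick sanity check (a sketch continuing the snippet above, not part of the package API), the ISTFT output should closely reconstruct the input waveform, since the forward STFT uses a centered Hann window:

```python
# Reconstruction error between the ISTFT output and the original waveform;
# expected to be close to zero (floating-point round-off only).
reconstruction_error = (y - batch_audio).abs().max()
print(reconstruction_error)
```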

## Example 3

Check the compatibility of TorchLibrosa with librosa. The numerical difference should be less than 1e-5.

```bash
python3 torchlibrosa/stft.py --device='cuda'    # --device='cpu' | 'cuda'
```
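Since the extractors above are ordinary `torch.nn.Module` objects, moving feature extraction onto the GPU (the use case mentioned in the introduction) is just a matter of calling `.to(device)`. A minimal sketch, assuming a CUDA device may or may not be available:

```python
import torch
import torchlibrosa as tl

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Build the extractor once and move it to the target device, like any other layer.
extractor = torch.nn.Sequential(
    tl.Spectrogram(n_fft=2048, hop_length=512),
    tl.LogmelFilterBank(sr=22050, n_fft=2048, n_mels=128)).to(device)

batch_audio = torch.empty(16, 22050).uniform_(-1, 1).to(device)  # (batch_size, samples)
batch_logmel = extractor(batch_audio)  # (batch_size, 1, time_steps, mel_bins), on `device`
```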

## Contact
Qiuqiang Kong, qiuqiangkong@gmail.com

## Cite
[1] Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-scale pretrained audio neural networks for audio pattern recognition." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 2880-2894.

## External links
Other related repos include:

torchaudio: https://github.com/pytorch/audio

Asteroid-filterbanks: https://github.com/asteroid-team/asteroid-filterbanks

Kapre: https://github.com/keunwoochoi/kapre


            
