# TorchLibrosa: PyTorch implementation of Librosa
This codebase provides PyTorch implementations of some librosa functions. If you previously trained on features extracted on the CPU with librosa and now want GPU acceleration during training and evaluation, TorchLibrosa provides almost identical features to the standard librosa functions (numerical difference less than 1e-5).
## Install
```bash
$ pip install torchlibrosa
```
## Example 1
Extract a mel spectrogram with TorchLibrosa. Set `is_log=True` (the default) to get a log mel spectrogram instead.
```python
import torch
import torchlibrosa as tl
batch_size = 16
sample_rate = 22050
win_length = 2048
hop_length = 512
n_mels = 128
batch_audio = torch.empty(batch_size, sample_rate).uniform_(-1, 1)  # (batch_size, samples_num), one second of audio

# TorchLibrosa feature extractor equivalent to librosa.feature.melspectrogram()
feature_extractor = torch.nn.Sequential(
    tl.Spectrogram(
        hop_length=hop_length,
        win_length=win_length,
    ),
    tl.LogmelFilterBank(
        sr=sample_rate,
        n_mels=n_mels,
        is_log=False,  # Default is True
    ),
)
batch_feature = feature_extractor(batch_audio)  # (batch_size, 1, time_steps, mel_bins)
```
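The `is_log` flag decides whether the mel energies are converted to decibels. As a rough sketch of what such a conversion looks like, assuming it follows the convention of `librosa.power_to_db`: the helper name `power_to_db` and the defaults `ref=1.0`, `amin=1e-10` and `top_db=80.0` below are illustrative assumptions, not values taken from this README.

```python
import math


def power_to_db(values, ref=1.0, amin=1e-10, top_db=80.0):
    """Convert power values to decibels in the style of librosa.power_to_db.

    Hypothetical helper for illustration only; the defaults are assumptions.
    """
    # Floor each value at amin so log10 never receives zero.
    log_spec = [10.0 * math.log10(max(amin, v)) for v in values]
    # Express the result relative to the reference power.
    offset = 10.0 * math.log10(max(amin, ref))
    log_spec = [v - offset for v in log_spec]
    if top_db is not None:
        # Clip everything that lies more than top_db below the peak.
        floor = max(log_spec) - top_db
        log_spec = [max(v, floor) for v in log_spec]
    return log_spec


db = power_to_db([1.0, 0.1, 0.0])
print(db)  # approximately [0.0, -10.0, -80.0]
```

With `is_log=False`, as in the example above, this step is skipped and the output stays on a linear power scale.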
## Example 2
Extract a spectrogram, then a log mel spectrogram, and compute the STFT and ISTFT with TorchLibrosa.
```python
import torch
import torchlibrosa as tl
batch_size = 16
sample_rate = 22050
win_length = 2048
hop_length = 512
n_mels = 128
batch_audio = torch.empty(batch_size, sample_rate).uniform_(-1, 1) # (batch_size, sample_rate)
# Spectrogram
spectrogram_extractor = tl.Spectrogram(n_fft=win_length, hop_length=hop_length)
sp = spectrogram_extractor.forward(batch_audio) # (batch_size, 1, time_steps, freq_bins)
# Log mel spectrogram
logmel_extractor = tl.LogmelFilterBank(sr=sample_rate, n_fft=win_length, n_mels=n_mels)
logmel = logmel_extractor.forward(sp) # (batch_size, 1, time_steps, mel_bins)
# STFT
stft_extractor = tl.STFT(n_fft=win_length, hop_length=hop_length)
real, imag = stft_extractor.forward(batch_audio)
# real: (batch_size, 1, time_steps, freq_bins), imag: (batch_size, 1, time_steps, freq_bins)
# ISTFT
istft_extractor = tl.ISTFT(n_fft=win_length, hop_length=hop_length)
y = istft_extractor.forward(real, imag, length=batch_audio.shape[-1]) # (batch_size, samples_num)
```
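The `time_steps`, `freq_bins` and `mel_bins` dimensions in the comments above can be estimated ahead of time. The sketch below assumes centered framing, i.e. the signal is padded by `n_fft // 2` samples on each side as `librosa.stft` does by default; the helper `expected_shapes` is hypothetical and not part of TorchLibrosa.

```python
def expected_shapes(samples_num, n_fft, hop_length, n_mels):
    """Estimate per-clip feature shapes under the centered-framing assumption."""
    # With n_fft // 2 padding on both sides, one frame starts every hop_length samples.
    time_steps = samples_num // hop_length + 1
    # A real-valued FFT of size n_fft yields n_fft // 2 + 1 non-redundant bins.
    freq_bins = n_fft // 2 + 1
    return {
        "spectrogram": (time_steps, freq_bins),
        "logmel": (time_steps, n_mels),
    }


shapes = expected_shapes(samples_num=22050, n_fft=2048, hop_length=512, n_mels=128)
print(shapes)  # {'spectrogram': (44, 1025), 'logmel': (44, 128)}
```

Under that assumption, the tensors in this example would have shapes `(16, 1, 44, 1025)` for the spectrogram and `(16, 1, 44, 128)` for the log mel spectrogram.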
## Example 3
Check the compatibility of TorchLibrosa with librosa. The numerical difference should be less than 1e-5.
```bash
python3 torchlibrosa/stft.py --device='cuda' # --device='cpu' | 'cuda'
```
## Contact
Qiuqiang Kong, qiuqiangkong@gmail.com
## Cite
[1] Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-scale pretrained audio neural networks for audio pattern recognition." IEEE/ACM Transactions on Audio, Speech, and Language Processing 28 (2020): 2880-2894.
## External links
Other related repos include:
torchaudio: https://github.com/pytorch/audio
Asteroid-filterbanks: https://github.com/asteroid-team/asteroid-filterbanks
Kapre: https://github.com/keunwoochoi/kapre