
**Auditory-inspired filterbanks for deep learning**
Welcome to HybrA-Filterbanks, a PyTorch library providing state-of-the-art auditory-inspired filterbanks for audio processing and deep learning applications.
## Overview
This library contains the official implementations of:
* **ISAC** ([paper](https://arxiv.org/abs/2505.07709)): Invertible and Stable Auditory filterbank with Customizable kernels for ML integration
* **HybrA** ([paper](https://arxiv.org/abs/2408.17358)): Hybrid Auditory filterbank that extends ISAC with learnable filters
* **ISACSpec**: Spectrogram variant with temporal averaging for robust feature extraction
* **ISACCC**: Cepstral coefficient extractor for speech recognition applications
## Key Features
✨ **PyTorch Integration**: All filterbanks are implemented as `nn.Module` for seamless integration into neural networks
🎯 **Auditory Modeling**: Based on human auditory perception principles (mel, ERB, bark scales)
⚡ **Fast Implementation**: Optimized using FFT-based circular convolution
🔧 **Flexible Configuration**: Customizable kernel sizes, frequency ranges, and scales
📊 **Frame Theory**: Built-in functions for frame bounds, condition numbers, and stability analysis
🎨 **Visualization**: Rich plotting capabilities for filter responses and time-frequency representations
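To make the "FFT-based circular convolution" item concrete, here is a minimal NumPy sketch of the general technique. It is an illustration only, not the library's implementation: the function names `circular_convolve_fft` and `circular_convolve_direct` are hypothetical, and the signal and kernel are random toy data.

```python
import numpy as np

def circular_convolve_fft(x, h):
    """Circular convolution of a real signal x with a shorter real kernel h via the FFT."""
    L = len(x)
    # Zero-pad the kernel to the signal length, then multiply the spectra.
    H = np.fft.fft(h, n=L)
    X = np.fft.fft(x)
    return np.fft.ifft(X * H).real

def circular_convolve_direct(x, h):
    """Reference implementation taken straight from the definition (O(L * K))."""
    L = len(x)
    y = np.zeros(L)
    for n in range(L):
        for k in range(len(h)):
            y[n] += h[k] * x[(n - k) % L]
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(64)  # toy signal
h = rng.standard_normal(9)   # toy kernel
y_fft = circular_convolve_fft(x, h)
y_ref = circular_convolve_direct(x, h)
print(np.allclose(y_fft, y_ref))
```

The FFT route costs O(L log L) regardless of kernel length, which is why it pays off for long kernels such as the `kernel_size=1024` used in the examples below.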
## Documentation
[https://danedane-haider.github.io/HybrA-Filterbanks/main/](https://danedane-haider.github.io/HybrA-Filterbanks/main/)
## Installation
We publish all releases on PyPI. You can install the current version by running:
```
pip install hybra
```
## Quick Start
### Basic ISAC Filterbank
```python
import torch
import torchaudio
from hybra import ISAC

# Load audio signal (torchaudio returns a tensor of shape (channels, samples))
x, fs = torchaudio.load("your_audio.wav")
x = x.to(torch.float32).unsqueeze(0)
L = x.shape[-1]

# Create ISAC filterbank
isac_fb = ISAC(
    kernel_size=1024,
    num_channels=128,
    L=L,
    fs=fs,
    scale='mel'
)

# Visualize frequency response
isac_fb.plot_response()
```
Condition number: 1.01
<img src="https://github.com/danedane-haider/HybrA-Filterbanks/blob/main/plots/ISAC_response.png?raw=true" width="100%">
```python
# Forward transform
y = isac_fb(x)
x_reconstructed = isac_fb.decoder(y)
# Visualize time-frequency representation
isac_fb.ISACgram(x, log_scale=True)
```
<img src="https://github.com/danedane-haider/HybrA-Filterbanks/blob/main/plots/ISAC_coeff.png?raw=true" width="100%">
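For readers unfamiliar with the `scale='mel'` option, the following sketch shows how center frequencies can be spaced uniformly on a mel axis. It assumes the common HTK-style mel formula; the library may use a different mel definition or a different placement rule, and the helper names here (`hz_to_mel`, `mel_to_hz`, `mel_spaced_frequencies`) are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """HTK-style mel scale (an assumption, other variants exist)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_spaced_frequencies(f_min, f_max, num_channels):
    """Frequencies in Hz that are equally spaced on the mel axis."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (num_channels - 1)
    return [mel_to_hz(m_min + i * step) for i in range(num_channels)]

# Eight toy channels between 0 Hz and 8 kHz (Nyquist for fs = 16 kHz).
centers = mel_spaced_frequencies(0.0, 8000.0, 8)
print([round(f) for f in centers])
```

The resulting frequencies are dense at low frequencies and sparse at high frequencies, mirroring the frequency resolution of human hearing.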
### HybrA with Learnable Filters
```python
from hybra import HybrA

# Create hybrid filterbank with learnable components
hybra_fb = HybrA(
    kernel_size=1024,
    learned_kernel_size=23,
    num_channels=128,
    L=L,
    fs=fs,
    tighten=True
)

# Visualize frequency response
hybra_fb.plot_response()

# Check condition number for stability
print(f"Condition number: {hybra_fb.condition_number():.2f}")
```
Condition number: 1.06
<img src="https://github.com/danedane-haider/HybrA-Filterbanks/blob/main/plots/HybrA_response.png?raw=true" width="100%">
```python
# Forward pass (supports gradients for training)
y = hybra_fb(x)
x_reconstructed = hybra_fb.decoder(y)
# Visualize time-frequency representation
hybra_fb.ISACgram(x, log_scale=True)
```
<img src="https://github.com/danedane-haider/HybrA-Filterbanks/blob/main/plots/HybrA_coeff.png?raw=true" width="100%">
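The condition number reported above is the ratio B/A of the upper and lower frame bounds. As a rough illustration of what that means, the sketch below computes the bounds for the simplest case: a filterbank with stride 1 (no decimation) acting by circular convolution, where the frame operator is diagonal in the Fourier domain. This is a simplifying assumption; it is not how `condition_number()` is implemented, and decimated filterbanks need extra aliasing terms. The function name `frame_bounds_undecimated` and the random kernels are hypothetical.

```python
import numpy as np

def frame_bounds_undecimated(kernels, L):
    """Frame bounds of a stride-1 filterbank on length-L signals.

    kernels: array of shape (num_channels, kernel_size).
    """
    # Frequency responses of all filters, zero-padded to the signal length.
    H = np.fft.fft(kernels, n=L, axis=-1)
    # Summed squared magnitudes: the eigenvalues of the frame operator.
    total_response = np.sum(np.abs(H) ** 2, axis=0)
    return float(total_response.min()), float(total_response.max())

rng = np.random.default_rng(0)
kernels = rng.standard_normal((16, 32))  # random toy kernels, not auditory filters
L = 256
A, B = frame_bounds_undecimated(kernels, L)
print(f"A = {A:.3f}, B = {B:.3f}, condition number = {B / A:.2f}")

# Check the frame inequality A*||x||^2 <= sum_k ||x * h_k||^2 <= B*||x||^2.
x = rng.standard_normal(L)
coeffs = np.fft.ifft(np.fft.fft(x) * np.fft.fft(kernels, n=L, axis=-1), axis=-1)
energy = float(np.sum(np.abs(coeffs) ** 2))
signal_energy = float(np.sum(x ** 2))
print(A * signal_energy <= energy <= B * signal_energy)
```

A condition number close to 1 means the filterbank nearly preserves signal energy, so both analysis and reconstruction are numerically stable.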
### ISAC Spectrograms and MFCCs
```python
from hybra import ISACSpec, ISACCC

# Spectrogram with temporal averaging for robust feature extraction
spectrogram = ISACSpec(
    kernel_size=1024,
    num_channels=40,
    L=L,
    fs=fs,
    power=2.0,
    is_log=True
)

# MFCC-like cepstral coefficients for speech recognition
mfcc_extractor = ISACCC(
    kernel_size=1024,
    num_channels=40,
    num_cc=13,
    L=L,
    fs=fs
)

# Extract features
spec_coeffs = spectrogram(x)
mfcc_coeffs = mfcc_extractor(x)

print(f"Spectrogram shape: {spec_coeffs.shape}")
print(f"MFCC shape: {mfcc_coeffs.shape}")
```
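As background on what "MFCC-like" means, cepstral coefficients are conventionally obtained by applying a DCT-II along the channel axis of a log spectrogram and keeping the first few coefficients. The sketch below shows that textbook recipe in NumPy on random toy data. It is an assumption that `ISACCC` follows the same recipe; its normalization and details may differ, and the helper names are hypothetical.

```python
import numpy as np

def dct_ii_matrix(num_cc, num_channels):
    """Unnormalized DCT-II basis: rows are cosines of increasing frequency."""
    n = np.arange(num_cc)[:, None]
    m = np.arange(num_channels)[None, :]
    return np.cos(np.pi * n * (m + 0.5) / num_channels)

def cepstral_coefficients(log_spec, num_cc):
    """Map a log spectrogram (num_channels, num_frames) to (num_cc, num_frames)."""
    D = dct_ii_matrix(num_cc, log_spec.shape[0])
    return D @ log_spec

rng = np.random.default_rng(0)
power_spec = rng.random((40, 100)) + 1e-3  # toy (channels, frames) power spectrogram
log_spec = np.log(power_spec)
cc = cepstral_coefficients(log_spec, num_cc=13)
print(cc.shape)  # (13, 100)
```

The zeroth coefficient is the sum of the log energies per frame, while higher coefficients capture progressively finer structure of the spectral envelope.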
### Integration with Neural Networks
Filterbanks can be easily integrated into neural networks as encoder/decoder pairs:
```python
import torch
import torch.nn as nn
import torchaudio
from hybra import HybrA


class Net(nn.Module):
    """Small GRU network that predicts a mask over 40 filterbank channels."""

    def __init__(self):
        super().__init__()
        self.linear_before = nn.Linear(40, 400)
        self.gru = nn.GRU(
            input_size=400,
            hidden_size=400,
            num_layers=2,
            batch_first=True,
        )
        self.linear_after = nn.Linear(400, 600)
        self.linear_after2 = nn.Linear(600, 600)
        self.linear_after3 = nn.Linear(600, 40)

    def forward(self, x):
        # (batch, channels, time) -> (batch, time, channels)
        x = x.permute(0, 2, 1)
        x = torch.relu(self.linear_before(x))
        x, _ = self.gru(x)
        x = torch.relu(self.linear_after(x))
        x = torch.relu(self.linear_after2(x))
        # Sigmoid keeps the mask values in (0, 1)
        x = torch.sigmoid(self.linear_after3(x))
        # Back to (batch, channels, time)
        x = x.permute(0, 2, 1)
        return x


class HybridfilterbankModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.nsnet = Net()
        self.fb = HybrA(fs=16000)

    def forward(self, x):
        # Encoder: filterbank analysis
        x = self.fb(x)
        # Log-power features, floored at 1e-8 to avoid log(0)
        mask = self.nsnet(torch.log10(torch.max(x.abs()**2, 1e-8 * torch.ones_like(x, dtype=torch.float32))))
        # Decoder: synthesis from the masked coefficients
        return self.fb.decoder(x*mask)


if __name__ == '__main__':
    audio, fs = torchaudio.load('your_audio.wav')
    model = HybridfilterbankModel()
    model(audio)
```
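The feature computation and masking step in `HybridfilterbankModel.forward` can be tried in isolation, without audio or the filterbank. The sketch below uses a random complex tensor as a stand-in for the filterbank coefficients and a random sigmoid output as a stand-in for the network mask. The shape `(batch, channels, frames)`, the complex dtype, and the helper name `log_power` are assumptions for illustration.

```python
import torch

def log_power(coeffs, floor=1e-8):
    """Log-compressed power of (possibly complex) coefficients, floored to avoid log(0)."""
    return torch.log10(coeffs.abs().pow(2).clamp_min(floor))

# Random complex tensor standing in for filterbank coefficients
# with assumed shape (batch, channels, frames).
torch.manual_seed(0)
coeffs = torch.randn(2, 40, 50, dtype=torch.complex64)
coeffs[0, 0, 0] = 0  # an exact zero, to exercise the floor

features = log_power(coeffs)

# Stand-in for the network output: a real-valued mask with values in (0, 1).
mask = torch.sigmoid(torch.randn(2, 40, 50))
masked = coeffs * mask  # scales each magnitude, leaves the phase untouched

print(features.shape, masked.dtype)
```

Using `clamp_min` gives the same result as the `torch.max(..., 1e-8 * torch.ones_like(...))` expression in the example above, without allocating an extra tensor of ones.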
## Citation
If you find our work valuable and use HybrA or ISAC in your work, please cite:
```
@inproceedings{haider2024holdmetight,
author = {Haider, Daniel and Perfler, Felix and Lostanlen, Vincent and Ehler, Martin and Balazs, Peter},
booktitle = {Annual Conference of the International Speech Communication Association (Interspeech)},
year = {2024},
title = {Hold me tight: Stable encoder/decoder design for speech enhancement},
}
@inproceedings{haider2025isac,
author = {Haider, Daniel and Perfler, Felix and Balazs, Peter and Hollomey, Clara and Holighaus, Nicki},
title = {{ISAC}: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration},
booktitle = {International Conference on Sampling Theory and Applications (SampTA)},
year = {2025}
}
```
Raw data
{
"_id": null,
"home_page": null,
"name": "hybra",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "filterbank, filterbank learning, feature extraction, hybrid filterbanks, encoder, decoder, auditory, frames, audio front-end, audio, machine-learning, pytorch, speech, music, \u26f7\ufe0f",
"author": "Daniel Haider and Felix Perfler",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/bd/88/74ad16e9255e807512df7617a745b7087a1ebae047c5b5c8374f205b82e4/hybra-2025.9.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A module for trainable encoder/decoder filterbanks with auditory bias.",
"version": "2025.9.5",
"project_urls": {
"Repository": "https://github.com/danedane-haider/Hybrid-Auditory-Filterbanks.git"
},
"split_keywords": [
"filterbank",
" filterbank learning",
" feature extraction",
" hybrid filterbanks",
" encoder",
" decoder",
" auditory",
" frames",
" audio front-end",
" audio",
" machine-learning",
" pytorch",
" speech",
" music",
" \u26f7\ufe0f"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "1957651bfc2462435f097f2b8666689cd2e210550a49e6f2dd603d8ad96cfd20",
"md5": "4931d3bd87326a5d34fe2f66fb883937",
"sha256": "e3f7e6dfc7d3db2a98e3a1dcb9c28362fd30af6497309b08cafe926ff7a89cb6"
},
"downloads": -1,
"filename": "hybra-2025.9.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4931d3bd87326a5d34fe2f66fb883937",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 27102,
"upload_time": "2025-09-11T13:31:13",
"upload_time_iso_8601": "2025-09-11T13:31:13.390365Z",
"url": "https://files.pythonhosted.org/packages/19/57/651bfc2462435f097f2b8666689cd2e210550a49e6f2dd603d8ad96cfd20/hybra-2025.9.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "bd8874ad16e9255e807512df7617a745b7087a1ebae047c5b5c8374f205b82e4",
"md5": "a6b284ae3fd5de4ac3cd48acdb0258d3",
"sha256": "4ac6747047284f9d42d60eae1580ed111cbd81c8e72cde6d7bad8e2347e7aec1"
},
"downloads": -1,
"filename": "hybra-2025.9.5.tar.gz",
"has_sig": false,
"md5_digest": "a6b284ae3fd5de4ac3cd48acdb0258d3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 32722,
"upload_time": "2025-09-11T13:31:14",
"upload_time_iso_8601": "2025-09-11T13:31:14.670426Z",
"url": "https://files.pythonhosted.org/packages/bd/88/74ad16e9255e807512df7617a745b7087a1ebae047c5b5c8374f205b82e4/hybra-2025.9.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-11 13:31:14",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "danedane-haider",
"github_project": "Hybrid-Auditory-Filterbanks",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "hybra"
}