# PANNs inference
**panns_inference** provides an easy-to-use Python interface for audio tagging and sound event detection. The models come from PANNs (Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition): https://github.com/qiuqiangkong/audioset_tagging_cnn
## Installation
PyTorch>=1.0 is required.
```
$ pip install panns-inference
```
## Usage
```
$ python3 example.py
```
A minimal example of calling both models from Python:
```
import librosa
from panns_inference import AudioTagging, SoundEventDetection, labels

# Load the clip as mono audio resampled to 32 kHz.
audio_path = 'examples/R9_ZSCveAHg_7s.wav'
(audio, _) = librosa.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # add a batch axis: (batch_size, segment_samples)

print('------ Audio tagging ------')
# checkpoint_path=None selects the default pretrained checkpoint.
# Pass device='cpu' to run without a GPU.
at = AudioTagging(checkpoint_path=None, device='cuda')
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
framewise_output = sed.inference(audio)
```
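
The snippet above returns raw scores but does not print them. Below is a minimal sketch of ranking tags by probability, similar in spirit to the listing under Results. The helper `top_k_tags` and the stand-in data are hypothetical and not part of the package; it assumes one probability per class for a single clip.

```
import numpy as np

def top_k_tags(clipwise_output, labels, k=10):
    """Return the k highest-scoring (label, probability) pairs for one clip."""
    scores = np.asarray(clipwise_output, dtype=float)
    order = np.argsort(scores)[::-1][:k]  # class indices, highest score first
    return [(labels[i], float(scores[i])) for i in order]

# Stand-in data so the sketch runs without the model (hypothetical values).
demo_labels = ['Speech', 'Music', 'Telephone', 'Alarm']
demo_scores = np.array([0.893, 0.092, 0.183, 0.014])

for name, prob in top_k_tags(demo_scores, demo_labels, k=3):
    print('{}: {:.3f}'.format(name, prob))
```

With the real model, something like `top_k_tags(clipwise_output[0], labels)` should work, assuming `clipwise_output` has shape (batch_size, classes_num) and `labels` is the list of class names imported from `panns_inference`.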
## Results
<pre>
------ Audio tagging ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Speech: 0.893
Telephone bell ringing: 0.754
Inside, small room: 0.235
Telephone: 0.183
Music: 0.092
Ringtone: 0.047
Inside, large room or hall: 0.028
Alarm: 0.014
Animal: 0.009
Vehicle: 0.008
------ Sound event detection ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Save fig to results/sed_result.pdf
</pre>
Sound event detection plot:
<img src="resources/sed_result.png" width="600">
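
If event boundaries are needed rather than a figure, one option is to threshold the per-frame probabilities of a single class and collect the contiguous active runs. The sketch below is an illustration only: the helper `active_segments` and the 0.5 threshold are assumptions, not part of the package, and it takes a 1-D array of probabilities for one class.

```
import numpy as np

def active_segments(frame_probs, threshold=0.5):
    """Return (start_frame, end_frame) pairs where probs >= threshold; end is exclusive."""
    active = np.asarray(frame_probs) >= threshold
    # Pad with False on both sides so runs touching either edge are closed.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    # Changes alternate between run starts and run ends.
    return [(int(s), int(e)) for s, e in zip(changes[0::2], changes[1::2])]

# Stand-in probabilities for one class (hypothetical values).
demo_probs = [0.1, 0.7, 0.8, 0.2, 0.9]
print(active_segments(demo_probs))
```

For real output this could be applied to `framewise_output[0, :, class_index]`, assuming `framewise_output` has shape (batch_size, time_steps, classes_num). Converting frame indices to seconds requires the model's frame rate, which is not stated here.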
## Cite
[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition." arXiv preprint arXiv:1912.10211 (2019).
Raw data
{
"_id": null,
"home_page": "https://github.com/qiuqiangkong/panns_inference",
"name": "panns-inference",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Qiuqiang Kong",
"author_email": "qiuqiangkong@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/42/aa/308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3/panns-inference-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "panns_inference: audio tagging and sound event detection inference toolbox",
"version": "0.1.1",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "adac0558484d9b5383125912b1cedeb95b1f7e928c2b0781f52d77b068f0ba3d",
"md5": "534116798fb5297a96de6141bf065eeb",
"sha256": "97f6b56b6c9467cf00e21f041e1f88933188c65c1b5ca64eeb3c92e37fb27fc2"
},
"downloads": -1,
"filename": "panns_inference-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "534116798fb5297a96de6141bf065eeb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 8267,
"upload_time": "2023-03-26T15:37:16",
"upload_time_iso_8601": "2023-03-26T15:37:16.277605Z",
"url": "https://files.pythonhosted.org/packages/ad/ac/0558484d9b5383125912b1cedeb95b1f7e928c2b0781f52d77b068f0ba3d/panns_inference-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "42aa308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3",
"md5": "bd01afbad13c4ab07f6e7dec66b02fd7",
"sha256": "f8074268513571775e154294729b66fc0ccbdbeceb5c8f6eaa9670664e40c03d"
},
"downloads": -1,
"filename": "panns-inference-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "bd01afbad13c4ab07f6e7dec66b02fd7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 6340,
"upload_time": "2023-03-26T15:37:18",
"upload_time_iso_8601": "2023-03-26T15:37:18.254996Z",
"url": "https://files.pythonhosted.org/packages/42/aa/308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3/panns-inference-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-26 15:37:18",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "qiuqiangkong",
"github_project": "panns_inference",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "panns-inference"
}