panns-inference

Name	panns-inference
Version	0.1.1
home_page	https://github.com/qiuqiangkong/panns_inference
Summary	panns_inference: audio tagging and sound event detection inference toolbox
upload_time	2023-03-26 15:37:18
maintainer
docs_url	None
author	Qiuqiang Kong
requires_python	>=3.6
license
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# PANNs inference

**panns_inference** provides an easy-to-use Python interface for audio tagging and sound event detection. The models are pretrained PANNs (Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition): https://github.com/qiuqiangkong/audioset_tagging_cnn

## Installation
PyTorch>=1.0 is required.
```
$ pip install panns-inference
```

## Usage
```
$ python3 example.py
```

For example, to run both models from Python:

```
import librosa
from panns_inference import AudioTagging, SoundEventDetection, labels

# Load the clip as 32 kHz mono audio.
audio_path = 'examples/R9_ZSCveAHg_7s.wav'
(audio, _) = librosa.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # (batch_size, segment_samples)

print('------ Audio tagging ------')
# checkpoint_path=None uses the default checkpoint; device='cuda' runs on the GPU.
at = AudioTagging(checkpoint_path=None, device='cuda')
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
framewise_output = sed.inference(audio)
```
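
The snippet above returns arrays and does not print anything itself. A minimal sketch of how the per-class scores of one clip could be ranked into a tag list like the one under Results is shown below. The helper `top_k_tags` and the toy values are hypothetical and are not part of panns_inference; the sketch only assumes that the scores and the label names are in the same order.

```python
def top_k_tags(scores, labels, k=10):
    """Return the k highest-scoring (label, score) pairs, best first.

    scores: 1-D sequence of per-class probabilities for ONE clip.
    labels: class names in the same order as scores.
    Hypothetical helper for illustration, not a panns_inference function.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    # Sort class indices by score, highest first, and keep the first k.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(labels[i], float(scores[i])) for i in order[:k]]


# Toy stand-ins so the sketch runs without a model or GPU (made-up values).
toy_labels = ["Speech", "Music", "Telephone", "Alarm"]
toy_scores = [0.893, 0.092, 0.183, 0.014]

for label, score in top_k_tags(toy_scores, toy_labels, k=3):
    print(f"{label}: {score:.3f}")
```

With real model output, a call such as `top_k_tags(clipwise_output[0], labels)` should work the same way, assuming `clipwise_output` has shape (batch_size, classes_num) and the imported `labels` lists the class names in the model's output order.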


## Results
<pre>
------ Audio tagging ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Speech: 0.893
Telephone bell ringing: 0.754
Inside, small room: 0.235
Telephone: 0.183
Music: 0.092
Ringtone: 0.047
Inside, large room or hall: 0.028
Alarm: 0.014
Animal: 0.009
Vehicle: 0.008
------ Sound event detection ------
Checkpoint path: /root/panns_data/Cnn14_mAP=0.431.pth
GPU number: 1
Save fig to results/sed_result.pdf
</pre>

Sound event detection plot:
<img src="resources/sed_result.png" width="600">
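
Besides plotting, framewise probabilities can be converted into time-stamped events by thresholding. The sketch below groups consecutive frames above a threshold into (onset, offset) pairs for a single class. The helper `frames_to_segments`, the threshold of 0.5, and the frame rate are assumptions chosen for illustration; none of them come from panns_inference, so check the frame rate against the model you use.

```python
def frames_to_segments(probs, threshold=0.5, frames_per_second=100):
    """Group consecutive frames with probability >= threshold.

    probs: 1-D sequence of per-frame probabilities for ONE class.
    frames_per_second: assumed frame rate of the model output (hypothetical default).
    Returns a list of (onset_seconds, offset_seconds) tuples.
    """
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a new segment opens at this frame
        elif p < threshold and start is not None:
            segments.append((start / frames_per_second, i / frames_per_second))
            start = None  # the segment closed just before this frame
    if start is not None:  # segment still open at the end of the clip
        segments.append((start / frames_per_second, len(probs) / frames_per_second))
    return segments


# Toy probabilities at an assumed 10 frames per second (made-up values).
toy_probs = [0.1, 0.7, 0.8, 0.2, 0.1, 0.9, 0.9]
print(frames_to_segments(toy_probs, threshold=0.5, frames_per_second=10))
```

With real output, one class of one clip could be passed in as `framewise_output[0][:, class_index]`, assuming `framewise_output` has shape (batch_size, time_steps, classes_num).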

## Cite
[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition." arXiv preprint arXiv:1912.10211 (2019).

Raw data

{
    "_id": null,
    "home_page": "https://github.com/qiuqiangkong/panns_inference",
    "name": "panns-inference",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "Qiuqiang Kong",
    "author_email": "qiuqiangkong@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/42/aa/308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3/panns-inference-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "panns_inference: audio tagging and sound event detection inference toolbox",
    "version": "0.1.1",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "adac0558484d9b5383125912b1cedeb95b1f7e928c2b0781f52d77b068f0ba3d",
                "md5": "534116798fb5297a96de6141bf065eeb",
                "sha256": "97f6b56b6c9467cf00e21f041e1f88933188c65c1b5ca64eeb3c92e37fb27fc2"
            },
            "downloads": -1,
            "filename": "panns_inference-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "534116798fb5297a96de6141bf065eeb",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 8267,
            "upload_time": "2023-03-26T15:37:16",
            "upload_time_iso_8601": "2023-03-26T15:37:16.277605Z",
            "url": "https://files.pythonhosted.org/packages/ad/ac/0558484d9b5383125912b1cedeb95b1f7e928c2b0781f52d77b068f0ba3d/panns_inference-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "42aa308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3",
                "md5": "bd01afbad13c4ab07f6e7dec66b02fd7",
                "sha256": "f8074268513571775e154294729b66fc0ccbdbeceb5c8f6eaa9670664e40c03d"
            },
            "downloads": -1,
            "filename": "panns-inference-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "bd01afbad13c4ab07f6e7dec66b02fd7",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 6340,
            "upload_time": "2023-03-26T15:37:18",
            "upload_time_iso_8601": "2023-03-26T15:37:18.254996Z",
            "url": "https://files.pythonhosted.org/packages/42/aa/308a94956501bf8a9a3d389e2c0e5cb405acc81780f7c16ba3898ae08fc3/panns-inference-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-26 15:37:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "qiuqiangkong",
    "github_project": "panns_inference",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "panns-inference"
}
