torch-vggish-yamnet


- Name: torch-vggish-yamnet
- Version: 0.2.1
- Home page: https://github.com/StefanoGiacomelli/torch_vggish_yamnet
- Summary: torch_vggish_yamnet: PyTorch VGGish & YAMNet models
- Upload time: 2024-06-13 13:51:14
- Author: Stefano Giacomelli (Ph.D. student UnivAQ)
- Requires Python: >=3.6
- License: none recorded
- Requirements: none recorded

# Torch VGGish & YAMNet embedding models

**torch_vggish_yamnet** provides a ready-to-use PyTorch port of Google's AudioSet audio embedding models, VGGish and YAMNet. The original audio tagging models come from *Models for AudioSet: A Large Scale Dataset of Audio Events*: https://github.com/tensorflow/models/tree/master/research/audioset

This project is a restructured fork of `torch_audioset` (see References).

## Installation
PyTorch>=1.0 is required (dependencies are installed automatically).
```
pip install torch-vggish-yamnet
```
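
To confirm that the package landed in the active environment, a minimal sketch along these lines may help. It relies only on the standard-library `importlib.metadata` module, which needs Python 3.8 or later (newer than the `>=3.6` the package declares). The helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None.

    Hypothetical helper for illustration; it is not provided by
    torch-vggish-yamnet itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("torch-vggish-yamnet")
    if version is None:
        print("torch-vggish-yamnet is not installed")
    else:
        print("torch-vggish-yamnet", version)
```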

## Usage
```
import torch

from torch_vggish_yamnet import yamnet
from torch_vggish_yamnet import vggish
from torch_vggish_yamnet.input_proc import WaveformToInput

# Placeholder input (assumption, for illustration only): 1 s of mono noise
# at 16 kHz, shaped [channels, samples]. Replace with your own waveform.
in_sr = 16000
x_in = torch.randn(1, in_sr)

# Convert the input signal (x_in) to a tensor of model-ready patches
converter = WaveformToInput()
in_tensor = converter(x_in.float(), in_sr)
print(in_tensor.shape)

# Initialize the models
embedding_yamnet = yamnet.yamnet(pretrained=True)
embedding_vggish = vggish.get_vggish(with_classifier=False, pretrained=True)

# Compute the embeddings (forward pass)
emb_yamnet, _ = embedding_yamnet(in_tensor)  # discard logits
emb_vggish = embedding_vggish(in_tensor)

print(emb_yamnet.shape, emb_vggish.shape)
```
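
The usage snippet takes a waveform `x_in` and its sample rate `in_sr` as given. One way to obtain them from a 16-bit PCM WAV file, using only the Python standard library, is sketched below. The function name `load_wav_pcm16` is hypothetical, and the `[channels, samples]` layout it produces is an assumption about what `WaveformToInput` expects; check it against the package source before relying on it.

```python
import array
import sys
import wave


def load_wav_pcm16(path):
    """Read a 16-bit PCM WAV file (hypothetical helper, not part of the package).

    Returns (channels, sample_rate), where channels holds one list of
    floats in [-1.0, 1.0) per audio channel.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        n_channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # De-interleave (ch0, ch1, ch0, ch1, ...) and scale to floats.
    channels = [
        [s / 32768.0 for s in samples[ch::n_channels]]
        for ch in range(n_channels)
    ]
    return channels, sample_rate
```

With PyTorch available, `x_in = torch.tensor(channels)` and `in_sr = sample_rate` would then yield a float tensor in the assumed `[channels, samples]` layout.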

## References
[1] AudioSet official site: http://g.co/audioset

[2]
```
@inproceedings{45857,
  title     = {Audio Set: An ontology and human-labeled dataset for audio events},
  author    = {Jort F. Gemmeke and Daniel P. W. Ellis and Dylan Freedman and Aren Jansen and Wade Lawrence and R. Channing Moore and Manoj Plakal and Marvin Ritter},
  year      = {2017},
  booktitle = {Proc. IEEE ICASSP 2017},
  address   = {New Orleans, LA}
}
```

[3]
```
@incollection{45611,
  title     = {CNN Architectures for Large-Scale Audio Classification},
  author    = {Shawn Hershey and Sourish Chaudhuri and Daniel P. W. Ellis and Jort F. Gemmeke and Aren Jansen and Channing Moore and Manoj Plakal and Devin Platt and Rif A. Saurous and Bryan Seybold and Malcolm Slaney and Ron Weiss and Kevin Wilson},
  year      = {2017},
  url       = {https://arxiv.org/abs/1609.09430},
  booktitle = {International Conference on Acoustics, Speech and Signal Processing (ICASSP)}
}
```

[4] torch_audioset GitHub repository: https://github.com/w-hc/torch_audioset/tree/master

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/StefanoGiacomelli/torch_vggish_yamnet",
    "name": "torch-vggish-yamnet",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "Stefano Giacomelli (Ph.D. student UnivAQ)",
    "author_email": "stefano.giacomelli@graduate.univaq.it",
    "download_url": "https://files.pythonhosted.org/packages/bf/a5/ee86aeb801fed1e76c3787badaddd25a3d8cdc5b0c9a132e9ea7cda4f972/torch_vggish_yamnet-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "torch_vggish_yamnet: PyTorch VGGish & YAMNet models",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/StefanoGiacomelli/torch_vggish_yamnet"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7c8ca3e0c1c3fbc7ca87839329bf1f3affe72dea542e47fc6413ee00e30a353e",
                "md5": "00f6cb6692c17f6b832a54d74d54ab5c",
                "sha256": "04ce86c077dfb1e6ccfaec849895088cf13af84a355e05ce6d1f495451af3b5c"
            },
            "downloads": -1,
            "filename": "torch_vggish_yamnet-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "00f6cb6692c17f6b832a54d74d54ab5c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 10822,
            "upload_time": "2024-06-13T13:51:13",
            "upload_time_iso_8601": "2024-06-13T13:51:13.411171Z",
            "url": "https://files.pythonhosted.org/packages/7c/8c/a3e0c1c3fbc7ca87839329bf1f3affe72dea542e47fc6413ee00e30a353e/torch_vggish_yamnet-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bfa5ee86aeb801fed1e76c3787badaddd25a3d8cdc5b0c9a132e9ea7cda4f972",
                "md5": "5b5b2f22199f9df1bbd249fae5999238",
                "sha256": "9794a5c3374512e66bd143f98d925c4546152c066ed6462431c7c9b40f42afb9"
            },
            "downloads": -1,
            "filename": "torch_vggish_yamnet-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "5b5b2f22199f9df1bbd249fae5999238",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 9928,
            "upload_time": "2024-06-13T13:51:14",
            "upload_time_iso_8601": "2024-06-13T13:51:14.677153Z",
            "url": "https://files.pythonhosted.org/packages/bf/a5/ee86aeb801fed1e76c3787badaddd25a3d8cdc5b0c9a132e9ea7cda4f972/torch_vggish_yamnet-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-13 13:51:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "StefanoGiacomelli",
    "github_project": "torch_vggish_yamnet",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "torch-vggish-yamnet"
}