sidlingvo

Name	sidlingvo
Version	0.1.0
home_page	https://github.com/google/speaker-id/tree/master/lingvo
Summary	Lingvo utils for Google SVL team
upload_time	2024-09-25 14:27:45
maintainer	None
docs_url	None
author	Quan Wang
requires_python	None
license	None
requirements	No requirements were recorded.

# `sidlingvo`: Lingvo-based libraries for speaker and language recognition

[![Python application](https://github.com/google/speaker-id/actions/workflows/python-app-lingvo.yml/badge.svg)](https://github.com/google/speaker-id/actions/workflows/python-app-lingvo.yml)
[![PyPI Version](https://img.shields.io/pypi/v/sidlingvo.svg)](https://pypi.python.org/pypi/sidlingvo)
[![Python Versions](https://img.shields.io/pypi/pyversions/sidlingvo.svg)](https://pypi.org/project/sidlingvo)
[![Downloads](https://static.pepy.tech/badge/sidlingvo)](https://www.pepy.tech/projects/sidlingvo)

## Overview

Here we open source some of the [Lingvo](https://github.com/tensorflow/lingvo)-based
libraries used in our publications.

## Disclaimer

**This is NOT an official Google product.**

## Feature frontend and TFLite inference

For the feature frontend and TFLite inference, see the API in
`sidlingvo/fe_utils.py`.

For pretrained speaker encoder models, the inference API is in `sidlingvo/wav_to_dvector.py`.

For pretrained language identification models, the inference API is in `sidlingvo/wav_to_lang.py`.

## GE2E and GE2E-XS losses

GE2E and GE2E-XS losses are implemented in `sidlingvo/loss_layers.py`.

GE2E was proposed in this paper:

* [Generalized End-to-End Loss for Speaker Verification](https://arxiv.org/abs/1710.10467)

GE2E-XS was proposed in this paper:

* [Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition](https://arxiv.org/abs/2104.01989)
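As a rough illustration of what the GE2E softmax loss computes, here is a minimal NumPy sketch based on the formulation in the GE2E paper. It is not the code in `sidlingvo/loss_layers.py`: the function name `ge2e_softmax_loss`, the fixed values of `w` and `b` (learnable parameters in the paper), and the averaging over utterances are assumptions made for this example. GE2E-XS is not covered here.

```python
import numpy as np


def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """Illustrative GE2E softmax loss (hypothetical helper, not the sidlingvo API).

    embeddings: array of shape [N, M, D] with N speakers, M utterances per
      speaker (M >= 2), and D-dimensional embeddings.
    w, b: scale and bias applied to the cosine similarity. They are learnable
      in the paper; here they are fixed constants (an assumption).
    """
    # L2-normalize every embedding.
    e = embeddings / np.linalg.norm(embeddings, axis=-1, keepdims=True)
    n, m, _ = e.shape

    # Centroid of each speaker, and leave-one-out centroids that exclude
    # the utterance being scored.
    centroids = e.mean(axis=1)                              # [N, D]
    loo = (e.sum(axis=1, keepdims=True) - e) / (m - 1)      # [N, M, D]
    centroids /= np.linalg.norm(centroids, axis=-1, keepdims=True)
    loo /= np.linalg.norm(loo, axis=-1, keepdims=True)

    # sim[j, i, k] = cosine(e[j, i], centroid[k]).
    sim = np.einsum("jid,kd->jik", e, centroids)
    # For the true speaker (k == j), use the leave-one-out centroid instead.
    idx = np.arange(n)
    sim[idx, :, idx] = np.einsum("jid,jid->ji", e, loo)

    s = w * sim + b
    # Numerically stable log-sum-exp over all speakers k.
    s_max = s.max(axis=-1, keepdims=True)
    lse = s_max[..., 0] + np.log(np.exp(s - s_max).sum(axis=-1))
    # Per-utterance loss: -S[j, i, j] + log(sum_k exp(S[j, i, k])).
    loss = lse - s[idx, :, idx]
    return loss.mean()
```

Embeddings that cluster tightly around their own speaker's centroid and far from the other centroids give a loss close to zero, while embeddings that are identical across speakers give a loss of `log(N)`.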

## Attentive temporal pooling

Attentive temporal pooling is implemented in `sidlingvo/cumulative_statistics_layer.py`.

It is used by these papers:

* [Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech](https://arxiv.org/abs/2202.12163)
* [Parameter-Free Attentive Scoring for Speaker Verification](https://arxiv.org/abs/2203.05642)
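The general idea of attention-weighted temporal pooling with running statistics can be sketched as follows. This is a generic NumPy illustration, not the layer in `sidlingvo/cumulative_statistics_layer.py` and not necessarily the exact formulation used in the papers: the function name `cumulative_attentive_stats`, the use of one scalar attention score per frame, and the concatenated mean and standard deviation output are assumptions made for this example.

```python
import numpy as np


def cumulative_attentive_stats(frames, scores, eps=1e-8):
    """Attention-weighted running mean and std (hypothetical helper).

    frames: array of shape [T, D], one feature vector per time step.
    scores: array of shape [T], unnormalized attention scores (assumed to be
      produced elsewhere, e.g. by a small network).
    Returns an array of shape [T, 2 * D]. Row t holds the weighted mean and
    standard deviation of frames[0..t], so it can be used in a streaming way.
    """
    # Positive weights; subtracting the max only improves numerical stability.
    weights = np.exp(scores - scores.max())[:, None]        # [T, 1]

    # Running sums of the weights and of the weighted first and second moments.
    cum_w = np.cumsum(weights, axis=0)
    cum_wx = np.cumsum(weights * frames, axis=0)
    cum_wx2 = np.cumsum(weights * frames ** 2, axis=0)

    mean = cum_wx / cum_w
    # Clamp at zero to guard against small negative values from rounding.
    var = np.maximum(cum_wx2 / cum_w - mean ** 2, 0.0)
    std = np.sqrt(var + eps)
    return np.concatenate([mean, std], axis=-1)
```

With uniform scores this reduces to an ordinary running mean and standard deviation; non-uniform scores let informative frames contribute more to the pooled representation.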

## Attentive scoring

Attentive scoring is implemented in `sidlingvo/attentive_scoring_layer.py`.

It was proposed in this paper:

* [Parameter-Free Attentive Scoring for Speaker Verification](https://arxiv.org/abs/2203.05642)


## Citations

Our papers are cited as:

```
@inproceedings{wan2018generalized,
  title={Generalized end-to-end loss for speaker verification},
  author={Wan, Li and Wang, Quan and Papir, Alan and Moreno, Ignacio Lopez},
  booktitle={International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={4879--4883},
  year={2018},
  organization={IEEE}
}

@inproceedings{pelecanos2021drvectors,
  title={{Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition}},
  author={Jason Pelecanos and Quan Wang and Ignacio Lopez Moreno},
  year={2021},
  booktitle={Proc. Interspeech},
  pages={4603--4607},
  doi={10.21437/Interspeech.2021-641}
}

@inproceedings{pelecanos2022parameter,
  title={Parameter-Free Attentive Scoring for Speaker Verification},
  author={Jason Pelecanos and Quan Wang and Yiling Huang and Ignacio Lopez Moreno},
  booktitle={Odyssey: The Speaker and Language Recognition Workshop},
  year={2022}
}

@inproceedings{wang2022attentive,
  title={Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech},
  author={Quan Wang and Yang Yu and Jason Pelecanos and Yiling Huang and Ignacio Lopez Moreno},
  booktitle={Odyssey: The Speaker and Language Recognition Workshop},
  year={2022}
}

@inproceedings{chojnacka2021speakerstew,
  title={{SpeakerStew: Scaling to many languages with a triaged multilingual text-dependent and text-independent speaker verification system}},
  author={Chojnacka, Roza and Pelecanos, Jason and Wang, Quan and Moreno, Ignacio Lopez},
  booktitle={Proc. Interspeech},
  pages={1064--1068},
  year={2021},
  doi={10.21437/Interspeech.2021-646},
  issn={2958-1796},
}
```

Raw data

{
    "_id": null,
    "home_page": "https://github.com/google/speaker-id/tree/master/lingvo",
    "name": "sidlingvo",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": null,
    "author": "Quan Wang",
    "author_email": "quanw@google.com",
    "download_url": "https://files.pythonhosted.org/packages/56/cb/8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a/sidlingvo-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Lingvo utils for Google SVL team",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/google/speaker-id/tree/master/lingvo"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "41b6291772feb7416379e6251392494045310cbf7d00fb7f651e34ccb40e65e1",
                "md5": "933ed65d21b8c29e9de785e05253a3df",
                "sha256": "7af3ae6fa85d281f56bedf497eda891f10249a10197172366b750d67e693676a"
            },
            "downloads": -1,
            "filename": "sidlingvo-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "933ed65d21b8c29e9de785e05253a3df",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 34831,
            "upload_time": "2024-09-25T14:27:44",
            "upload_time_iso_8601": "2024-09-25T14:27:44.054628Z",
            "url": "https://files.pythonhosted.org/packages/41/b6/291772feb7416379e6251392494045310cbf7d00fb7f651e34ccb40e65e1/sidlingvo-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "56cb8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a",
                "md5": "0e2e0c3cbea7c6221f7ebfc3d5e43663",
                "sha256": "e2e5a84a7629cf305d85c89f7b77cbc7a903fd2b9548d5432d167c5bb1f5ce6a"
            },
            "downloads": -1,
            "filename": "sidlingvo-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "0e2e0c3cbea7c6221f7ebfc3d5e43663",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 30914,
            "upload_time": "2024-09-25T14:27:45",
            "upload_time_iso_8601": "2024-09-25T14:27:45.368612Z",
            "url": "https://files.pythonhosted.org/packages/56/cb/8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a/sidlingvo-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-25 14:27:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "google",
    "github_project": "speaker-id",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sidlingvo"
}
