# `sidlingvo`: Lingvo-based libraries for speaker and language recognition
## Overview
Here we open source some of the [Lingvo](https://github.com/tensorflow/lingvo)-based
libraries used in our publications.
## Disclaimer
**This is NOT an official Google product.**
## Feature frontend and TFLite inference
For the feature frontend and TFLite inference, see the API in
`sidlingvo/fe_utils.py`.
For pretrained speaker encoder models, the inference API is in `sidlingvo/wav_to_dvector.py`.
For pretrained language identification models, the inference API is in `sidlingvo/wav_to_lang.py`.
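
Once a speaker encoder has produced d-vectors, a common way to compare two of them is cosine similarity. The sketch below is a minimal illustration and does not call any `sidlingvo` API: it assumes the d-vectors are already available as plain lists of floats, and the names `cosine_similarity`, `same_speaker`, and the threshold value `0.7` are hypothetical and would need tuning on real data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_speaker(enroll_dvector, test_dvector, threshold=0.7):
    """Accept the test utterance if its score reaches the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return cosine_similarity(enroll_dvector, test_dvector) >= threshold
```
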
## GE2E and GE2E-XS losses
GE2E and GE2E-XS losses are implemented in `sidlingvo/loss_layers.py`.
GE2E was proposed in this paper:
* [Generalized End-to-End Loss for Speaker Verification](https://arxiv.org/abs/1710.10467)
GE2E-XS was proposed in this paper:
* [Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition](https://arxiv.org/abs/2104.01989)
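
For intuition, the softmax variant of the GE2E loss described in the first paper can be sketched in plain Python. This is an illustrative sketch only, not the code in `sidlingvo/loss_layers.py`: the function name `ge2e_softmax_loss` and the nested-list batch layout are assumptions, and `w=10.0`, `b=-5.0` are used as fixed values here although they are learnable in the paper. GE2E-XS is not covered.

```python
import math


def _cos(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def _mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def ge2e_softmax_loss(batch, w=10.0, b=-5.0):
    """Sum of GE2E softmax losses over a batch.

    batch[j][i] is the embedding of utterance i of speaker j.
    Every speaker needs at least two utterances, and centroids are
    assumed to be non-zero.
    """
    centroids = [_mean(utterances) for utterances in batch]
    total = 0.0
    for j, utterances in enumerate(batch):
        for i, embedding in enumerate(utterances):
            sims = []
            for k, centroid in enumerate(centroids):
                if k == j:
                    # For the true speaker, leave the utterance itself
                    # out of the centroid.
                    centroid = _mean(utterances[:i] + utterances[i + 1:])
                sims.append(w * _cos(embedding, centroid) + b)
            # Numerically stable log-sum-exp over all speakers.
            top = max(sims)
            log_sum_exp = top + math.log(sum(math.exp(s - top) for s in sims))
            total += log_sum_exp - sims[j]
    return total
```

A batch whose embeddings cluster by speaker should give a smaller loss than one where the speakers are mixed together.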
## Attentive temporal pooling
Attentive temporal pooling is implemented in `sidlingvo/cumulative_statistics_layer.py`.
It is used by these papers:
* [Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech](https://arxiv.org/abs/2202.12163)
* [Parameter-Free Attentive Scoring for Speaker Verification](https://arxiv.org/abs/2203.05642)
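
As a rough illustration of the idea, attention-weighted pooling can be computed cumulatively so that an output is available at every time step, which suits streaming use. The sketch below is a generic example and is not the implementation in `sidlingvo/cumulative_statistics_layer.py`: the function name `cumulative_attentive_mean` is hypothetical, only a weighted mean is computed, and how the per-frame attention scores are produced is left outside the sketch.

```python
import math


def cumulative_attentive_mean(frames, scores):
    """Attention-weighted mean of frames[0..t] for every time step t.

    frames: list of equal-length feature vectors, one per time step.
    scores: list of unnormalized attention scores, one per time step.
    Weights are exp(score); a running maximum keeps exp() from overflowing.
    """
    if len(frames) != len(scores):
        raise ValueError("need exactly one score per frame")
    if not frames:
        return []
    weighted_sum = [0.0] * len(frames[0])
    weight_total = 0.0
    running_max = -math.inf
    outputs = []
    for frame, score in zip(frames, scores):
        new_max = max(running_max, score)
        # Rescale the accumulators to the new reference maximum.
        # On the first step this is exp(-inf) == 0.0.
        scale = math.exp(running_max - new_max)
        weight = math.exp(score - new_max)
        weighted_sum = [scale * acc + weight * x
                        for acc, x in zip(weighted_sum, frame)]
        weight_total = scale * weight_total + weight
        running_max = new_max
        outputs.append([acc / weight_total for acc in weighted_sum])
    return outputs
```

With equal scores this reduces to a running arithmetic mean; a frame with a much larger score dominates the pooled output from that step on.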
## Attentive scoring
Attentive scoring is implemented in `sidlingvo/attentive_scoring_layer.py`.
It was proposed in this paper:
* [Parameter-Free Attentive Scoring for Speaker Verification](https://arxiv.org/abs/2203.05642)
## Citations
Our papers can be cited as:
```
@inproceedings{wan2018generalized,
title={Generalized end-to-end loss for speaker verification},
author={Wan, Li and Wang, Quan and Papir, Alan and Moreno, Ignacio Lopez},
booktitle={International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={4879--4883},
year={2018},
organization={IEEE}
}
@inproceedings{pelecanos2021drvectors,
title={{Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition}},
author={Jason Pelecanos and Quan Wang and Ignacio Lopez Moreno},
year={2021},
booktitle={Proc. Interspeech},
pages={4603--4607},
doi={10.21437/Interspeech.2021-641}
}
@inproceedings{pelecanos2022parameter,
title={Parameter-Free Attentive Scoring for Speaker Verification},
author={Jason Pelecanos and Quan Wang and Yiling Huang and Ignacio Lopez Moreno},
booktitle={Odyssey: The Speaker and Language Recognition Workshop},
year={2022}
}
@inproceedings{wang2022attentive,
title={Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech},
author={Quan Wang and Yang Yu and Jason Pelecanos and Yiling Huang and Ignacio Lopez Moreno},
booktitle={Odyssey: The Speaker and Language Recognition Workshop},
year={2022}
}
@inproceedings{chojnacka2021speakerstew,
title={{SpeakerStew: Scaling to many languages with a triaged multilingual text-dependent and text-independent speaker verification system}},
author={Chojnacka, Roza and Pelecanos, Jason and Wang, Quan and Moreno, Ignacio Lopez},
booktitle={Proc. Interspeech},
pages={1064--1068},
year={2021},
doi={10.21437/Interspeech.2021-646},
issn={2958-1796},
}
```
Raw data
{
"_id": null,
"home_page": "https://github.com/google/speaker-id/tree/master/lingvo",
"name": "sidlingvo",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Quan Wang",
"author_email": "quanw@google.com",
"download_url": "https://files.pythonhosted.org/packages/56/cb/8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a/sidlingvo-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Lingvo utils for Google SVL team",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/google/speaker-id/tree/master/lingvo"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "41b6291772feb7416379e6251392494045310cbf7d00fb7f651e34ccb40e65e1",
"md5": "933ed65d21b8c29e9de785e05253a3df",
"sha256": "7af3ae6fa85d281f56bedf497eda891f10249a10197172366b750d67e693676a"
},
"downloads": -1,
"filename": "sidlingvo-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "933ed65d21b8c29e9de785e05253a3df",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 34831,
"upload_time": "2024-09-25T14:27:44",
"upload_time_iso_8601": "2024-09-25T14:27:44.054628Z",
"url": "https://files.pythonhosted.org/packages/41/b6/291772feb7416379e6251392494045310cbf7d00fb7f651e34ccb40e65e1/sidlingvo-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "56cb8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a",
"md5": "0e2e0c3cbea7c6221f7ebfc3d5e43663",
"sha256": "e2e5a84a7629cf305d85c89f7b77cbc7a903fd2b9548d5432d167c5bb1f5ce6a"
},
"downloads": -1,
"filename": "sidlingvo-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "0e2e0c3cbea7c6221f7ebfc3d5e43663",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 30914,
"upload_time": "2024-09-25T14:27:45",
"upload_time_iso_8601": "2024-09-25T14:27:45.368612Z",
"url": "https://files.pythonhosted.org/packages/56/cb/8639f551b69fb120ce988d26cf243da28498ddf1ec764f7aa5229b2f029a/sidlingvo-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-25 14:27:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "google",
"github_project": "speaker-id",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "sidlingvo"
}