| Field | Value |
|---|---|
| Name | easymms |
| Version | 0.1.6 |
| Summary | A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project |
| Author | Abdeladim Sadiki |
| Requires Python | >=3.8 |
| License | MIT |
| Upload time | 2023-06-02 22:26:32 |
# EasyMMS
A simple Python package to easily use [Meta's Massively Multilingual Speech (MMS) project](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).
[](https://pypi.org/project/easymms/)
[](https://github.com/abdeladim-s/easymms/actions/workflows/wheels.yml)
<a target="_blank" href="https://colab.research.google.com/github/abdeladim-s/easymms/blob/main/examples/EasyMMS.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
The [current MMS code](https://github.com/facebookresearch/fairseq/blob/main/examples/mms/asr/infer/mms_infer.py) uses `subprocess` to call another Python script, which is inconvenient and can lead to several [issues](https://github.com/facebookresearch/fairseq/issues/5117).
This package addresses those problems by wrapping the project in an API that is easy to integrate into other projects.
<!-- TOC -->
* [Installation](#installation)
* [Quickstart](#quickstart)
* [ASR](#asr)
* [ASR with Alignment](#asr-with-alignment)
* [TTS](#tts)
* [LID](#lid)
* [API reference](#api-reference)
* [License](#license)
* [Disclaimer & Credits](#disclaimer--credits)
<!-- TOC -->
# Installation
1. You will need [ffmpeg](https://ffmpeg.org/download.html) for audio processing.
2. Install `easymms` from PyPI:
```bash
pip install easymms
```
or from source
```bash
pip install git+https://github.com/abdeladim-s/easymms
```
3. If you want to use the [`Alignment` model](https://github.com/facebookresearch/fairseq/tree/main/examples/mms/data_prep):
* You will need `perl` to use [uroman](https://github.com/isi-nlp/uroman).
Check the [Perl website](https://www.perl.org/get.html) for installation instructions on different platforms.
* You will need a nightly version of `torchaudio`:
```shell
pip install -U --pre torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
```
* You might need [sox](https://arielvb.readthedocs.io/en/latest/docs/commandline/sox.html) as well.
4. `fairseq` has not yet included the `MMS` project in its released PyPI version, so until the next release you will need to install `fairseq` from source:
```shell
pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq
```
# Quickstart
:warning: There is an [issue](https://github.com/abdeladim-s/easymms/issues/3) with `fairseq` when running the code in interactive environments like Jupyter notebooks.<br/>
**Please use normal Python files** or use the Colab notebook provided above.
## ASR
You will first need to download the model weights; you can find and download all the supported models from [here](https://github.com/facebookresearch/fairseq/tree/main/examples/mms#asr).
```python
from easymms.models.asr import ASRModel
asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=False)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    print(transcription)
```
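If you want to keep the transcripts around, the plain strings returned with `align=False` can be written straight to disk. A minimal sketch (the `save_transcriptions` helper below is hypothetical, not part of `easymms`; it only assumes `transcriptions` is a list of strings, as in the loop above):

```python
from pathlib import Path

def save_transcriptions(files, transcriptions, out_dir="transcripts"):
    """Write each transcription to <out_dir>/<media stem>.txt and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for media, text in zip(files, transcriptions):
        target = out / (Path(media).stem + ".txt")
        target.write_text(text, encoding="utf-8")
        paths.append(target)
    return paths
```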
## ASR with Alignment
```python
from easymms.models.asr import ASRModel
asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=True)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
    print("----")
```
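Since each aligned segment carries `start_time`, `end_time`, and `text`, the output maps naturally onto the SRT subtitle format. A minimal sketch (the `segments_to_srt` helper below is hypothetical and assumes the timestamps are in seconds; double-check the units returned by your version):

```python
def segments_to_srt(segments):
    """Format [{'start_time', 'end_time', 'text'}, ...] as an SRT string."""
    def ts(seconds):
        # SRT timestamps look like HH:MM:SS,mmm
        ms = int(round(seconds * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    lines = []
    for i, seg in enumerate(segments, start=1):
        lines.append(str(i))
        lines.append(f"{ts(seg['start_time'])} --> {ts(seg['end_time'])}")
        lines.append(seg['text'])
        lines.append("")  # blank line separates cues
    return "\n".join(lines)
```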
## Alignment model only
```python
from easymms.models.alignment import AlignmentModel
align_model = AlignmentModel()
transcriptions = align_model.align('path/to/wav_file.wav',
                                   transcript=["segment 1", "segment 2"],
                                   lang='eng')
for transcription in transcriptions:
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
```
## TTS
```python
from easymms.models.tts import TTSModel
tts = TTSModel('eng')
res = tts.synthesize("This is a simple example")
tts.save(res)
```
## LID
Coming Soon
# API reference
You can check the [API reference documentation](https://abdeladim-s.github.io/easymms/) for more details.
# License
Since the models are [released under the CC-BY-NC 4.0 license](https://github.com/facebookresearch/fairseq/blob/main/examples/mms/README.md#license), this project follows the same [license](./LICENSE).
# Disclaimer & Credits
This project is neither endorsed nor certified by Meta AI; it only simplifies the use of the MMS project.
<br/>
All credit goes to the authors and to Meta for open-sourcing the models.
<br/>
Please check their paper [Scaling Speech Technology to 1000+ languages](https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/) and their [blog post](https://ai.facebook.com/blog/multilingual-model-speech-recognition/).