| Field | Value |
|---|---|
| Name | easymms |
| Version | 0.1.6 |
| Summary | A simple Python package to easily use Meta's Massively Multilingual Speech (MMS) project |
| Author | Abdeladim Sadiki |
| Requires Python | >=3.8 |
| License | MIT |
| Upload time | 2023-06-02 22:26:32 |
# EasyMMS
A simple Python package to easily use [Meta's Massively Multilingual Speech (MMS) project](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).
[](https://pypi.org/project/easymms/)
[](https://github.com/abdeladim-s/easymms/actions/workflows/wheels.yml)
<a target="_blank" href="https://colab.research.google.com/github/abdeladim-s/easymms/blob/main/examples/EasyMMS.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
The [current MMS code](https://github.com/facebookresearch/fairseq/blob/main/examples/mms/asr/infer/mms_infer.py) uses `subprocess` to call another Python script, which is inconvenient and can lead to several [issues](https://github.com/facebookresearch/fairseq/issues/5117).
This package addresses those problems by wrapping the project in an API that is easy to integrate into other projects.
<!-- TOC -->
* [Installation](#installation)
* [Quickstart](#quickstart)
* [ASR](#asr)
* [ASR with Alignment](#asr-with-alignment)
* [TTS](#tts)
* [LID](#lid)
* [API reference](#api-reference)
* [License](#license)
* [Disclaimer & Credits](#disclaimer--credits)
<!-- TOC -->
# Installation
1. You will need [ffmpeg](https://ffmpeg.org/download.html) for audio processing.
2. Install `easymms` from PyPI:
```bash
pip install easymms
```
or from source
```bash
pip install git+https://github.com/abdeladim-s/easymms
```
3. If you want to use the [`Alignment` model](https://github.com/facebookresearch/fairseq/tree/main/examples/mms/data_prep):
* You will need `perl` to use [uroman](https://github.com/isi-nlp/uroman).
Check the [Perl website](https://www.perl.org/get.html) for installation instructions on different platforms.
* You will need a nightly version of `torchaudio`:
```shell
pip install -U --pre torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118
```
* You might need [sox](https://arielvb.readthedocs.io/en/latest/docs/commandline/sox.html) as well.
4. `fairseq` has not yet included the `MMS` project in its released PyPI version, so until the next release you will need to install `fairseq` from source:
```shell
pip uninstall fairseq && pip install git+https://github.com/facebookresearch/fairseq
```
# Quickstart
:warning: There is an [issue](https://github.com/abdeladim-s/easymms/issues/3) with `fairseq` when running the code in interactive environments like Jupyter notebooks.<br/>
**Please use normal Python files** or use the Colab notebook provided above.
## ASR
You will first need to download the model weights; you can find and download all the supported models from [here](https://github.com/facebookresearch/fairseq/tree/main/examples/mms#asr).
```python
from easymms.models.asr import ASRModel
asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=False)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    print(transcription)
```
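If you want to keep the transcripts around, the plain strings returned with `align=False` can be written straight to disk. A minimal sketch (the `save_transcriptions` helper below is hypothetical, not part of `easymms`; it only assumes `transcriptions` is a list of strings, as in the loop above):

```python
from pathlib import Path

def save_transcriptions(files, transcriptions, out_dir="transcripts"):
    """Write each transcription to <out_dir>/<media stem>.txt and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for media, text in zip(files, transcriptions):
        target = out / (Path(media).stem + ".txt")
        target.write_text(text, encoding="utf-8")
        paths.append(target)
    return paths
```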
## ASR with Alignment
```python
from easymms.models.asr import ASRModel
asr = ASRModel(model='/path/to/mms/model')
files = ['path/to/media_file_1', 'path/to/media_file_2']
transcriptions = asr.transcribe(files, lang='eng', align=True)
for i, transcription in enumerate(transcriptions):
    print(f">>> file {files[i]}")
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
    print("----")
```
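Since each aligned segment carries `start_time`, `end_time`, and `text`, the output maps naturally onto the SRT subtitle format. A minimal sketch (the `segments_to_srt` helper below is hypothetical and assumes the timestamps are in seconds; double-check the units returned by your version):

```python
def segments_to_srt(segments):
    """Format [{'start_time', 'end_time', 'text'}, ...] as an SRT string."""
    def ts(seconds):
        # SRT timestamps look like HH:MM:SS,mmm
        ms = int(round(seconds * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    lines = []
    for i, seg in enumerate(segments, start=1):
        lines.append(str(i))
        lines.append(f"{ts(seg['start_time'])} --> {ts(seg['end_time'])}")
        lines.append(seg['text'])
        lines.append("")  # blank line separates cues
    return "\n".join(lines)
```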
## Alignment model only
```python
from easymms.models.alignment import AlignmentModel
align_model = AlignmentModel()
transcriptions = align_model.align('path/to/wav_file.wav',
                                   transcript=["segment 1", "segment 2"],
                                   lang='eng')
for transcription in transcriptions:
    for segment in transcription:
        print(f"{segment['start_time']} -> {segment['end_time']}: {segment['text']}")
```
## TTS
```python
from easymms.models.tts import TTSModel
tts = TTSModel('eng')
res = tts.synthesize("This is a simple example")
tts.save(res)
```
## LID
Coming Soon
# API reference
You can check the [API reference documentation](https://abdeladim-s.github.io/easymms/) for more details.
# License
Since the models are [released under the CC-BY-NC 4.0 license](https://github.com/facebookresearch/fairseq/blob/main/examples/mms/README.md#license), this project follows the same [license](./LICENSE).
# Disclaimer & Credits
This project is neither endorsed nor certified by Meta AI; it only simplifies the use of the MMS project.
<br/>
All credit goes to the authors and to Meta for open-sourcing the models.
<br/>
Please check their paper [Scaling Speech Technology to 1000+ languages](https://research.facebook.com/publications/scaling-speech-technology-to-1000-languages/) and their [blog post](https://ai.facebook.com/blog/multilingual-model-speech-recognition/).