sdab


| Field | Value |
| --- | --- |
| Name | sdab |
| Version | 0.1.2 |
| Home page | https://github.com/MetythornPenn/sdab.git |
| Summary | Khmer Speech To Text Inference API using Wav2Vec2 with Pretrain Model |
| Upload time | 2024-05-30 03:05:25 |
| Maintainer | None |
| Docs URL | None |
| Author | Metythorn Penn |
| Requires Python | None |
| License | Apache Software License 2.0 |
| Keywords | asr |
| Requirements | No requirements were recorded. |
| Travis CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# Sdab

#### Khmer Automatic Speech Recognition


Sdab is a Python package for automatic speech recognition (ASR) with a focus on the Khmer language. It runs Khmer speech recognition offline, using either my pretrained model or other models built on Wav2Vec2.

License: [Apache-2.0 License](https://github.com/MetythornPenn/sdab/blob/main/LICENSE)

Pretrained model: [Hugging Face](https://huggingface.co/metythorn/khmer-asr-openslr)

## Installation


#### Install from PyPI
```sh
pip install sdab
```

#### Install from source

```sh
# clone the repository and enter it
git clone https://github.com/MetythornPenn/sdab.git
cd sdab

# install the library from source in editable mode
pip install -e .
```
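
After either install path, it can help to confirm which version Python actually sees. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name introduced here and is not part of sdab.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "sdab"):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None
```

Calling `installed_version()` returns `None` when the install did not succeed in the active environment, which is a quick way to catch a wrong virtualenv.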

## Usage

#### Download sample audio

```bash
# use the /raw/ URL; the /blob/ URL returns the GitHub HTML page, not the audio
wget -O audio.wav https://github.com/MetythornPenn/sdab/raw/main/sample/audio.wav
```
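
A download can silently save an HTML page under the name `audio.wav`, so a quick sanity check before transcription may save some confusion. This is a minimal sketch using the standard-library `wave` module; `is_valid_wav` is a hypothetical helper, not part of sdab. Note that `wave` only reads uncompressed PCM files, so other valid WAV encodings would also be reported as invalid here.

```python
import wave


def is_valid_wav(path: str) -> bool:
    """Return True if `path` opens as a PCM WAV file with at least one audio frame."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        # wave.Error: not a RIFF/WAVE file (for example, an HTML page)
        # EOFError: empty or truncated file; OSError: missing or unreadable file
        return False
```

If `is_valid_wav("audio.wav")` returns `False`, re-check the download URL before passing the file to the model.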

#### Python API

```python
from sdab import Sdab

file_path = "audio.wav"
model_name = "metythorn/khmer-asr-openslr"  # or local directory path

sdab = Sdab(file_path=file_path, model_name=model_name)
print(sdab.result)

# result : ស្ពានកំពងចំលងអ្នកលើងនៅព្រីវែញជាស្ពានវេញជាងគេសក្នុងព្រសរាជាអាចកម្ពុជា
```

- `file_path`: path to the audio file
- `model_name`: pretrained model, given as a Hugging Face model ID or a local directory path
- `device`: `cpu` or `cuda`; defaults to `cpu`
- `tokenized`: whether to show `[PAD]` in the output; defaults to `False`
- `return`: the recognized Khmer text
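
The optional parameters can be combined as in the sketch below. It assumes that `device` and `tokenized` are accepted as keyword arguments by `Sdab`, which the parameter list suggests but the example above does not show. `pick_device` and `transcribe` are hypothetical helpers introduced for illustration; `torch.cuda.is_available()` is the standard PyTorch check for a usable GPU.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # if PyTorch is missing, fall back to CPU
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def transcribe(file_path: str,
               model_name: str = "metythorn/khmer-asr-openslr",
               tokenized: bool = False) -> str:
    """Hypothetical helper: run Sdab on one audio file and return the text."""
    # Imported lazily so this helper can be defined even before sdab is installed.
    from sdab import Sdab

    # Assumption: `device` and `tokenized` are keyword arguments of Sdab,
    # as the parameter list in this README suggests.
    sdab = Sdab(file_path=file_path, model_name=model_name,
                device=pick_device(), tokenized=tokenized)
    return sdab.result
```

With this in place, `transcribe("audio.wav")` would return the recognized text, and `transcribe("audio.wav", tokenized=True)` would keep `[PAD]` in the output, provided the assumptions above hold.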

## Reference 
- Inspired by [Techcast](https://www.youtube.com/watch?v=ekhFo-6JzLQ&t=28s)
- Khmer word segmentation from SeangHay [khmercut](https://github.com/seanghay/khmercut.git) | [khmersegment](https://github.com/seanghay/khmersegment)
- Wav2Vec2 from Facebook [Wav2Vec2](https://github.com/facebookresearch/fairseq/blob/main/examples/wav2vec/README.md)



            
