# OpenPhonemizer
**[Audio Samples](https://neuralvox.github.io/OpenPhonemizer/) / [Models](https://huggingface.co/openphonemizer/ckpt) / [Live Demo](https://huggingface.co/spaces/openphonemizer/PhonemizerHub) / [Dataset](https://huggingface.co/datasets/mrfakename/ipa-phonemes-word-pairs)**
A permissively licensed, open-source, local IPA phonemizer (G2P) powered by deep learning. OpenPhonemizer attempts to replicate the `espeak` phonemizer while remaining permissively licensed.
OpenPhonemizer is heavily based on the amazing [DeepPhonemizer](https://github.com/as-ideas/DeepPhonemizer). The main change is the model checkpoints, which more closely resemble `espeak`'s phonemizer.
Optional GPL-licensed portions are available [here](https://github.com/NeuralVox/OpenPhonemizer-GPL).
## Features
* Permissively licensed & open source
* Fast & efficient
* Works well with TTS models that depend on phonemizer or espeak
* Automatic GPU acceleration (CUDA/MPS) if available
## Project
* Project status: Alpha
* Supported languages: English (more coming soon! What languages do you want? Let me know!)
## Installation
Easily install OpenPhonemizer:
```bash
pip install -U openphonemizer
```
Or, install the latest version from Git:
```bash
pip install -U "openphonemizer @ git+https://github.com/NeuralVox/OpenPhonemizer"
```
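To confirm which version ended up in your environment after either install command, a small check like the one below can help. This is only a sketch: `installed_version` is a hypothetical helper name, and it relies solely on the standard library's `importlib.metadata`, not on any OpenPhonemizer API.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "openphonemizer") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the package is missing.
print(installed_version())
```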
## Usage
### OpenPhonemizer
```python
from openphonemizer import OpenPhonemizer
phonemizer = OpenPhonemizer()
# Or specify a custom checkpoint path: OpenPhonemizer('model.pt')
phonemizer('test')
phonemizer('hello this is a test')
```
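When phonemizing many strings, it can be convenient to wrap the call shown above in a small helper. The sketch below assumes only what the example demonstrates, namely that a phonemizer object can be called with a string; `phonemize_all` is a hypothetical helper, not part of the package.

```python
from typing import Callable, Dict, Iterable


def phonemize_all(phonemizer: Callable[[str], str], texts: Iterable[str]) -> Dict[str, str]:
    """Map each distinct input text to its phonemized form."""
    results: Dict[str, str] = {}
    for text in texts:
        if text not in results:  # phonemize repeated inputs only once
            results[text] = phonemizer(text)
    return results


# Usage with the real model (requires openphonemizer to be installed):
#   from openphonemizer import OpenPhonemizer
#   phonemes = phonemize_all(OpenPhonemizer(), ['test', 'hello this is a test'])
```

Because the helper accepts any callable, it can be tried with a stand-in function before the model has been downloaded.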
By default, OpenPhonemizer loads a built-in dictionary of words and their phonemes. Because the dictionary is stored inefficiently, the default model is ~100MB larger and uses more memory, but it is _much_ faster. If you're low on VRAM, you can either run the model exclusively on CPU (`disable_gpu=True`) or load a model without a dictionary.
**Load without dictionary:**
```python
from cached_path import cached_path
from openphonemizer import OpenPhonemizer
phonemizer = OpenPhonemizer(str(cached_path('hf://openphonemizer/ckpt/best_model_no_optim.pt'))) # add disable_gpu=True to run on CPU only
phonemizer('test')
phonemizer('hello this is a test')
```
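The two memory-saving options (CPU-only and the dictionary-free checkpoint) can be combined behind one function. The following is a minimal sketch, not the library's own API: `make_phonemizer` and its `low_memory`, `cpu_only`, `factory`, and `fetch` parameters are hypothetical names introduced for illustration. It assumes `disable_gpu` is accepted as a keyword argument, as the comment in the example above suggests, and reuses the checkpoint URL from that example.

```python
from typing import Any, Callable, Optional

# Checkpoint URL copied from the "Load without dictionary" example.
NO_DICT_CKPT = "hf://openphonemizer/ckpt/best_model_no_optim.pt"


def make_phonemizer(
    low_memory: bool = False,
    cpu_only: bool = False,
    factory: Optional[Callable[..., Any]] = None,
    fetch: Optional[Callable[[str], Any]] = None,
) -> Any:
    """Build a phonemizer, optionally without the dictionary and/or on CPU only.

    `factory` and `fetch` default to OpenPhonemizer and cached_path; they are
    injectable so the selection logic can be exercised without the real model.
    """
    if factory is None:
        from openphonemizer import OpenPhonemizer as factory
    kwargs = {"disable_gpu": True} if cpu_only else {}
    if not low_memory:
        return factory(**kwargs)  # default checkpoint with built-in dictionary
    if fetch is None:
        from cached_path import cached_path as fetch
    return factory(str(fetch(NO_DICT_CKPT)), **kwargs)


# Usage (requires openphonemizer and cached_path to be installed):
#   phonemizer = make_phonemizer(low_memory=True, cpu_only=True)
#   phonemizer('hello this is a test')
```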
**[NEW] Use autoregressive model:**
An autoregressive model is now available. It is more accurate but slightly slower. To use it (with the same imports as in the previous example):
```python
OpenPhonemizer(str(cached_path('hf://openphonemizer/autoreg-ckpt/best_model.pt')))
```
## Evaluation
We introduce PhonemizerBench, a benchmark that evaluates how closely alternative phonemizers match `espeak`. Scores are relative to `espeak`, whose own score is taken to be 100.
**Run 1**
| Phonemizer | Score |
| --- | --- |
| Gruut | 75.08 |
| DeepPhonemizer | 85.24 |
| G2P_EN | 86.16 |
| OpenPhonemizer | 93.64 |
| OpenPhonemizer Autoregressive | **93.74** |
**Run 2**
| Phonemizer | Score |
| --- | --- |
| Gruut | 75.54 |
| DeepPhonemizer | 85.03 |
| G2P_EN | 86.28 |
| OpenPhonemizer | 93.54 |
| OpenPhonemizer Autoregressive | **93.59** |
**Run 3**
| Phonemizer | Score |
| --- | --- |
| Gruut | 73.72 |
| DeepPhonemizer | 84.64 |
| G2P_EN | 85.74 |
| OpenPhonemizer | 93.38 |
| OpenPhonemizer Autoregressive | **93.67** |
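To compare the phonemizers across all three runs at a glance, the scores can be averaged. The snippet below is a small worked calculation over the numbers in the tables above; it is not part of PhonemizerBench itself.

```python
from statistics import mean

# Scores copied from Run 1, Run 2, and Run 3 above.
runs = {
    "Gruut": [75.08, 75.54, 73.72],
    "DeepPhonemizer": [85.24, 85.03, 84.64],
    "G2P_EN": [86.16, 86.28, 85.74],
    "OpenPhonemizer": [93.64, 93.54, 93.38],
    "OpenPhonemizer Autoregressive": [93.74, 93.59, 93.67],
}

# Mean score per phonemizer, rounded to two decimals like the tables.
averages = {name: round(mean(scores), 2) for name, scores in runs.items()}

# Print from lowest to highest average.
for name, avg in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{name}: {avg:.2f}")
```

The averages come out to roughly 74.78 (Gruut), 84.97 (DeepPhonemizer), 86.06 (G2P_EN), 93.52 (OpenPhonemizer), and 93.67 (OpenPhonemizer Autoregressive).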
## Todo
- [x] Train autoregressive model
- [x] Allow disabling GPU usage
- [ ] Multilingual support (any requests?)
## License
OpenPhonemizer is open source software. You may use it under the BSD-3-Clause Clear license found in the LICENSE file.
Please note that OpenPhonemizer depends on software under different licenses. When redistributing or modifying OpenPhonemizer, it is your responsibility to comply with these licenses (notably LGPL).
*By contributing to this repository, you grant the author the permission to change the license in the future at their sole discretion or offer different licenses to other individuals.*
**NOTE:** Model weights may be licensed under different licenses. Please make sure to check all model weights for licenses.
## Credits
Special thanks to [Christian Schäfer](https://github.com/cschaefer26), who created [Deep Phonemizer](https://github.com/as-ideas/DeepPhonemizer), on which OpenPhonemizer relies. OpenPhonemizer uses [num2words](https://github.com/savoirfairelinux/num2words) to read out large numbers and [cached_path](https://github.com/allenai/cached_path) from Allen AI for caching models.
OpenPhonemizer was created by [mrfakename](https://twitter.com/realmrfakename).