| Field | Value |
| --- | --- |
| Name | dict-from-g2pE |
| Version | 0.0.2 |
| Summary | CLI to create a pronunciation dictionary by predicting English ARPAbet phonemes using seq2seq model from g2pE and the possibility of ignoring punctuation and splitting on hyphens before prediction. |
| upload_time | 2024-01-24 12:45:06 |
| maintainer | Stefan Taubert |
| author | Stefan Taubert |
| requires_python | <3.13,>=3.8 |
| license | MIT |
| keywords | arpabet, pronunciation, dictionary, g2pe, language, linguistics |
# dict-from-g2pE
[![PyPI](https://img.shields.io/pypi/v/dict-from-g2pE.svg)](https://pypi.python.org/pypi/dict-from-g2pE)
[![PyPI](https://img.shields.io/pypi/pyversions/dict-from-g2pE.svg)](https://pypi.python.org/pypi/dict-from-g2pE)
[![MIT](https://img.shields.io/github/license/stefantaubert/dict-from-g2p.svg)](LICENSE)
[![PyPI](https://img.shields.io/pypi/wheel/dict-from-g2pE.svg)](https://pypi.python.org/pypi/dict-from-g2pE/#files)
![PyPI](https://img.shields.io/pypi/implementation/dict-from-g2pE.svg)
[![PyPI](https://img.shields.io/github/commits-since/stefantaubert/dict-from-g2p/latest/master.svg)](https://github.com/stefantaubert/dict-from-g2p/compare/v0.0.2...master)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.10561178.svg)](https://doi.org/10.5281/zenodo.10561178)
CLI to create a pronunciation dictionary by predicting English ARPAbet phonemes with the seq2seq model from [g2pE](https://www.github.com/kyubyong/g2p), with the option of ignoring punctuation and splitting words on hyphens before prediction.
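The preprocessing described here can be pictured roughly as follows. This is a minimal sketch and not the tool's actual implementation: `split_punctuation`, `pronounce`, and `toy_predict` are hypothetical names, and `toy_predict` is a lookup-table stand-in for the g2pE model. The behaviour it imitates (punctuation kept as separate symbols, the hyphen kept between the predicted parts) is inferred from the example output in this README.

```python
import string
from typing import Callable, List, Tuple

# Hypothetical type: any function mapping a word to ARPAbet symbols.
Predictor = Callable[[str], List[str]]


def split_punctuation(word: str, punctuation: str = string.punctuation) -> Tuple[str, str, str]:
    """Split a word into leading punctuation, core, and trailing punctuation."""
    start = 0
    while start < len(word) and word[start] in punctuation:
        start += 1
    end = len(word)
    while end > start and word[end - 1] in punctuation:
        end -= 1
    return word[:start], word[start:end], word[end:]


def pronounce(word: str, predict: Predictor, split_on_hyphen: bool = True) -> List[str]:
    """Predict symbols for the core of the word and re-attach punctuation."""
    leading, core, trailing = split_punctuation(word)
    parts = core.split("-") if split_on_hyphen else [core]
    symbols = list(leading)
    for index, part in enumerate(parts):
        if index > 0:
            symbols.append("-")  # keep the hyphen as its own symbol
        if part:
            symbols.extend(predict(part))
    symbols.extend(trailing)
    return symbols


# Stand-in for the seq2seq model (assumption: values copied from the README output).
TOY_MODEL = {"test": ["T", "EH1", "S", "T"], "def": ["D", "EH1", "F"]}


def toy_predict(part: str) -> List[str]:
    return TOY_MODEL[part.lower()]


print(" ".join(pronounce("Test-def.", toy_predict)))
```

Passing the predictor as a parameter keeps the punctuation and hyphen handling independent of the model, so the same logic could wrap any grapheme-to-phoneme function.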
## Installation
```sh
pip install dict-from-g2pE --user
```
## Usage
```sh
dict-from-g2pE-cli
```
### Example
```sh
# Create example vocabulary
cat > /tmp/vocabulary.txt << EOF
Test?
abc,
"def
Test-def.
"xyz?
"uv-w?
EOF
# Create dictionary from vocabulary
dict-from-g2pE-cli \
/tmp/vocabulary.txt \
/tmp/result.dict \
--split-on-hyphen \
--n-jobs 4
cat /tmp/result.dict
```
Output:
```dict
Test? T EH1 S T ?
abc, AE1 B K ,
"def " D EH1 F
Test-def. T EH1 S T - D EH1 F .
"xyz? " Z IH1 JH IH0 Z ?
"uv-w? " AH1 V - V IY1 ?
```
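To use the resulting file from Python, the lines can be read back into a mapping. The sketch below assumes that each line holds one word followed by its whitespace-separated symbols, as in the output above, and that words contain no whitespace; `parse_dict_lines` is a hypothetical helper and not part of this package.

```python
from typing import Dict, Iterable, List, Tuple


def parse_dict_lines(lines: Iterable[str]) -> Dict[str, List[Tuple[str, ...]]]:
    """Map each word to the list of pronunciations found for it."""
    entries: Dict[str, List[Tuple[str, ...]]] = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip empty lines
        word, *symbols = parts
        entries.setdefault(word, []).append(tuple(symbols))
    return entries


# Lines copied from the example output above.
EXAMPLE_LINES = [
    "Test? T EH1 S T ?",
    "abc, AE1 B K ,",
    '"def " D EH1 F',
    "Test-def. T EH1 S T - D EH1 F .",
    '"xyz? " Z IH1 JH IH0 Z ?',
    '"uv-w? " AH1 V - V IY1 ?',
]

entries = parse_dict_lines(EXAMPLE_LINES)
for word, pronunciations in entries.items():
    print(word, "->", [" ".join(p) for p in pronunciations])
```

For a real file, the same function can be fed an open file handle, for example `parse_dict_lines(open("/tmp/result.dict", encoding="utf-8"))`.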
## Development setup
```sh
# update
sudo apt update
# install Python 3.8-3.12 so that the tests can be run against each version
sudo apt install python3-pip \
python3.8 python3.8-dev python3.8-distutils python3.8-venv \
python3.9 python3.9-dev python3.9-distutils python3.9-venv \
python3.10 python3.10-dev python3.10-distutils python3.10-venv \
python3.11 python3.11-dev python3.11-distutils python3.11-venv \
python3.12 python3.12-dev python3.12-distutils python3.12-venv
# install pipenv for creation of virtual environments
python3.8 -m pip install pipenv --user
# check out repo
git clone https://github.com/stefantaubert/dict-from-g2p.git
cd dict-from-g2p
# create virtual environment
python3.8 -m pipenv install --dev
```
## Running the tests
```sh
# first set up the tool as described in "Development setup"
# then, navigate into the directory of the repo (if not already done)
cd dict-from-g2p
# activate environment
python3.8 -m pipenv shell
# run tests
tox
```
Final lines of test result output:
```log
py38: commands succeeded
py39: commands succeeded
py310: commands succeeded
py311: commands succeeded
py312: commands succeeded
congratulations :)
```
## License
MIT License
## Acknowledgments
[g2pE: A Simple Python Module for English Grapheme To Phoneme Conversion](https://www.github.com/kyubyong/g2p)
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 416228727 – CRC 1410
## Citation
If you want to cite this repo, you can use the citation entry generated by GitHub (see *About => Cite this repository*), which is also available there as BibTeX. In plain-text form it reads:
```txt
Taubert, S. (2024). dict-from-g2pE (Version 0.0.2) [Computer software]. https://doi.org/10.5281/zenodo.10561178
```
## Raw data
{
"_id": null,
"home_page": "",
"name": "dict-from-g2pE",
"maintainer": "Stefan Taubert",
"docs_url": null,
"requires_python": "<3.13,>=3.8",
"maintainer_email": "pypi@stefantaubert.com",
"keywords": "ARPAbet,Pronunciation,Dictionary,g2pE,Language,Linguistics",
"author": "Stefan Taubert",
"author_email": "pypi@stefantaubert.com",
"download_url": "https://files.pythonhosted.org/packages/9f/52/9b674e3335cc9b16fa99715f1fed28a85879dfcc84aa7e39dc98ec2dc460/dict-from-g2pE-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "CLI to create a pronunciation dictionary by predicting English ARPAbet phonemes using seq2seq model from g2pE and the possibility of ignoring punctuation and splitting on hyphens before prediction.",
"version": "0.0.2",
"project_urls": {
"Homepage": "https://github.com/stefantaubert/dict-from-g2p",
"Issues": "https://github.com/stefantaubert/dict-from-g2p/issues"
},
"split_keywords": [
"arpabet",
"pronunciation",
"dictionary",
"g2pe",
"language",
"linguistics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bf5832146701b058d8ea14241c629c033ac3d4a8f31388fd7b834bdbb3f29457",
"md5": "83db0145eb3eeb4088c0178c2d6faca4",
"sha256": "19300df7c83a6a4b0d101f858a4dc75ea8b9d7791be4b2cba8c6cd9d2a5f871b"
},
"downloads": -1,
"filename": "dict_from_g2pE-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "83db0145eb3eeb4088c0178c2d6faca4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.8",
"size": 9771,
"upload_time": "2024-01-24T12:45:04",
"upload_time_iso_8601": "2024-01-24T12:45:04.934736Z",
"url": "https://files.pythonhosted.org/packages/bf/58/32146701b058d8ea14241c629c033ac3d4a8f31388fd7b834bdbb3f29457/dict_from_g2pE-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "9f529b674e3335cc9b16fa99715f1fed28a85879dfcc84aa7e39dc98ec2dc460",
"md5": "e65d71e1c8a0422763d2b687d82a777c",
"sha256": "2c90efd6149a7306d1812b674da65afcbf620d7cbf50a0e936c393314810ae91"
},
"downloads": -1,
"filename": "dict-from-g2pE-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "e65d71e1c8a0422763d2b687d82a777c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.8",
"size": 10451,
"upload_time": "2024-01-24T12:45:06",
"upload_time_iso_8601": "2024-01-24T12:45:06.607739Z",
"url": "https://files.pythonhosted.org/packages/9f/52/9b674e3335cc9b16fa99715f1fed28a85879dfcc84aa7e39dc98ec2dc460/dict-from-g2pE-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-24 12:45:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "stefantaubert",
"github_project": "dict-from-g2p",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "dict-from-g2pe"
}