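# g2p-mix

Grapheme-to-phoneme (G2P) conversion for mixed Chinese and English text, using a separate backend for each language: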
- Cantonese: [pycantonese](https://github.com/jacksonllee/pycantonese)
- English: [g2p_en](https://github.com/Kyubyong/g2p)
- Mandarin: [pypinyin](https://github.com/mozillazg/python-pinyin) or [g2pW](https://github.com/GitYCC/g2pW)
## Usage
```bash
$ pip install g2p-mix
$ python
```
### Mandarin
```python
>>> from g2p_mix import G2pMix
>>> G2pMix().g2p("你这个idea, 不太make sense。", sandhi=True, return_seg=True)
```
```
[
Token(word='你', lang='ZH', pos='r', phones=[['n', 'i3']]),
Token(word='这个', lang='ZH', pos='r', phones=[['zh', 'e4'], ['g', 'e5']]),
Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0']),
Token(word=',', lang='SYM', pos='x', phones=[',']),
Token(word='不太', lang='ZH', pos='d', phones=[['b', 'u2'], ['t', 'ai4']]),
Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K']),
Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S']),
Token(word='。', lang='SYM', pos='x', phones=['。']),
]
```
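The output above uses two shapes for `phones`: Chinese tokens hold one `[initial, final]` pair per syllable, while English and symbol tokens hold a flat list. A minimal sketch of collapsing both into one phone sequence follows. The `Token` class here is a stand-in that only mirrors the printed fields, and `flatten_phones` is a hypothetical helper; neither is part of g2p-mix.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    """Stand-in mirroring the fields printed above; not g2p-mix's own class."""
    word: str
    lang: str
    pos: Optional[str]
    phones: list


def flatten_phones(tokens):
    """Return one flat phone list, unpacking per-syllable pairs where present."""
    flat = []
    for token in tokens:
        for item in token.phones:
            if isinstance(item, (list, tuple)):
                # Chinese syllable: [initial, final-with-tone]
                flat.extend(item)
            else:
                # English phone or punctuation symbol
                flat.append(item)
    return flat


tokens = [
    Token("你", "ZH", "r", [["n", "i3"]]),
    Token("idea", "EN", None, ["AY0", "D", "IY1", "AH0"]),
    Token(",", "SYM", "x", [","]),
]
print(flatten_phones(tokens))
```

With the real library, the list returned by `g2p(..., return_seg=True)` could presumably be passed in place of `tokens`, assuming its items expose a `phones` attribute as the printed output suggests.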
### Cantonese
```python
>>> from g2p_mix import G2pMix
>>> G2pMix(jyut=True).g2p("你这个idea, 不太make sense。", return_seg=True)
```
```
[
Token(word='你', lang='ZH', pos='PRON', phones=[['n', 'ei5']]),
Token(word='這個', lang='ZH', pos='PRON', phones=[['z', 'e3'], ['g', 'o3']]),
Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0']),
Token(word=',', lang='SYM', pos='x', phones=[',']),
Token(word='不', lang='ZH', pos='ADV', phones=[['b', 'at1']]),
Token(word='太', lang='ZH', pos='ADV', phones=[['t', 'aai3']]),
Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K']),
Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S']),
Token(word='。', lang='SYM', pos='x', phones=['。']),
]
```
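In both the Mandarin and Cantonese outputs, the final of each Chinese syllable ends in a tone digit (`'i3'`, `'ei5'`, `'aai3'`). If a downstream model wants the tone as a separate feature, it can be split off with a small helper. This is a sketch only: `split_tone` is a hypothetical function, not a g2p-mix API, and it assumes the tone is always a single trailing digit as in the examples shown here.

```python
import re

# One trailing digit is treated as the tone; everything before it is the final.
_TONE = re.compile(r"^(.*?)(\d)$")


def split_tone(final):
    """Split 'ei5' into ('ei', 5); return (final, None) if there is no trailing digit."""
    match = _TONE.match(final)
    if match is None:
        return final, None
    return match.group(1), int(match.group(2))


print(split_tone("ei5"))   # Jyutping final from the Cantonese output
print(split_tone("ai4"))   # pinyin final from the Mandarin output
print(split_tone("。"))    # symbols carry no tone
```

English phones also end in a digit (`'AY0'`, `'IY1'`), but there it marks stress rather than tone, so the helper is best applied only to tokens whose `lang` is `'ZH'`.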
## Raw data
{
"_id": null,
"home_page": "https://github.com/pengzhendong/g2p-mix",
"name": "g2p-mix",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Zhendong Peng",
"author_email": "pzd17@tsinghua.org.cn",
"download_url": null,
"platform": null,
"description": "# g2p-mix\n\n- Cantonese: [pycantonese](https://github.com/jacksonllee/pycantonese)\n- English: [g2p_en](https://github.com/Kyubyong/g2p)\n- Mandarin: [pypinyin](https://github.com/mozillazg/python-pinyin) or [g2pW](https://github.com/GitYCC/g2pW)\n\n## Usage\n\n```bash\n$ pip install g2p-mix\n$ python\n```\n\n### Mandarin\n\n```python\n>>> from g2p_mix import G2pMix\n>>> G2pMix().g2p(\"\u4f60\u8fd9\u4e2aidea, \u4e0d\u592amake sense\u3002\", sandhi=True, return_seg=True)\n```\n\n```\n[\n Token(word='\u4f60', lang='ZH', pos='r', phones=[['n', 'i3']]),\n Token(word='\u8fd9\u4e2a', lang='ZH', pos='r', phones=[['zh', 'e4'], ['g', 'e5']]),\n Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0']),\n Token(word=',', lang='SYM', pos='x', phones=[',']),\n Token(word='\u4e0d\u592a', lang='ZH', pos='d', phones=[['b', 'u2'], ['t', 'ai4']]),\n Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K']),\n Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S']),\n Token(word='\u3002', lang='SYM', pos='x', phones=['\u3002']),\n]\n```\n\n### Cantonese\n\n```python\n>>> G2pMix(jyut=True).g2p(\"\u4f60\u8fd9\u4e2aidea, \u4e0d\u592amake sense\u3002\", return_seg=True)\n```\n\n```\n[\n Token(word='\u4f60', lang='ZH', pos='PRON', phones=[['n', 'ei5']])\n Token(word='\u9019\u500b', lang='ZH', pos='PRON', phones=[['z', 'e3'], ['g', 'o3']])\n Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0'])\n Token(word=',', lang='SYM', pos='x', phones=[','])\n Token(word='\u4e0d', lang='ZH', pos='ADV', phones=[['b', 'at1']])\n Token(word='\u592a', lang='ZH', pos='ADV', phones=[['t', 'aai3']])\n Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K'])\n Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S'])\n Token(word='\u3002', lang='SYM', pos='x', phones=['\u3002'])\n]\n```\n",
"bugtrack_url": null,
"license": null,
"summary": "G2P mix",
"version": "0.6.4",
"project_urls": {
"Homepage": "https://github.com/pengzhendong/g2p-mix"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "2de60f14521431c36f709e5e8ff0dc1aaf0264c846e9ef3edad0732c6ca36295",
"md5": "6fed6454e9ae00694393dfe17b1c5ff2",
"sha256": "2fe928be99ecbc045aff9561d1590d111169268a9f0003f481b8480e399a8813"
},
"downloads": -1,
"filename": "g2p_mix-0.6.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6fed6454e9ae00694393dfe17b1c5ff2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 4987954,
"upload_time": "2025-02-22T02:11:55",
"upload_time_iso_8601": "2025-02-22T02:11:55.970575Z",
"url": "https://files.pythonhosted.org/packages/2d/e6/0f14521431c36f709e5e8ff0dc1aaf0264c846e9ef3edad0732c6ca36295/g2p_mix-0.6.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-22 02:11:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pengzhendong",
"github_project": "g2p-mix",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "g2p_en",
"specs": []
},
{
"name": "jieba",
"specs": []
},
{
"name": "pycantonese",
"specs": []
},
{
"name": "pyopenhc",
"specs": []
},
{
"name": "pypinyin",
"specs": []
},
{
"name": "pypinyin-dict",
"specs": []
},
{
"name": "wetext",
"specs": []
},
{
"name": "wordsegment",
"specs": []
}
],
"lcname": "g2p-mix"
}
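The `digests` entry in the metadata above can be used to check a downloaded wheel before installing it. A minimal sketch using only the standard library follows; `sha256_matches` is a hypothetical helper name, and the local file path is whatever location the wheel was saved to.

```python
import hashlib


def sha256_matches(path, expected_hex):
    """Return True if the SHA-256 of the file at `path` equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For the release listed here, the call would take the path to `g2p_mix-0.6.4-py3-none-any.whl` and the `sha256` value from the `digests` object.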