# g2p-mix

G2P mix. Version 0.6.4, by Zhendong Peng, uploaded to PyPI on 2025-02-22 02:11:55 (UTC). Home page: https://github.com/pengzhendong/g2p-mix

Requirements: g2p_en, jieba, pycantonese, pyopenhc, pypinyin, pypinyin-dict, wetext, wordsegment

Backends by language:

- Cantonese: [pycantonese](https://github.com/jacksonllee/pycantonese)
- English: [g2p_en](https://github.com/Kyubyong/g2p)
- Mandarin: [pypinyin](https://github.com/mozillazg/python-pinyin) or [g2pW](https://github.com/GitYCC/g2pW)

## Usage

```bash
$ pip install g2p-mix
$ python
```

### Mandarin

```python
>>> from g2p_mix import G2pMix
>>> G2pMix().g2p("你这个idea, 不太make sense。", sandhi=True, return_seg=True)
```

```
[
  Token(word='你', lang='ZH', pos='r', phones=[['n', 'i3']]),
  Token(word='这个', lang='ZH', pos='r', phones=[['zh', 'e4'], ['g', 'e5']]),
  Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0']),
  Token(word=',', lang='SYM', pos='x', phones=[',']),
  Token(word='不太', lang='ZH', pos='d', phones=[['b', 'u2'], ['t', 'ai4']]),
  Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K']),
  Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S']),
  Token(word='。', lang='SYM', pos='x', phones=['。']),
]
```
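Downstream code often needs a single flat phone sequence rather than per-token lists. The sketch below shows one way to flatten the structure printed above, where `ZH` tokens hold one `[initial, final]` list per syllable and `EN`/`SYM` tokens hold plain strings. The `Token` dataclass and the `flatten_phones` helper are hypothetical stand-ins written for this illustration; they are not imported from g2p_mix, and the real token class may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Token:
    """Hypothetical stand-in mirroring the fields shown in the output above."""
    word: str
    lang: str
    pos: Optional[str]
    phones: List[Union[str, List[str]]]


def flatten_phones(tokens: List[Token], skip_symbols: bool = True) -> List[str]:
    """Return one flat phone sequence for the whole utterance."""
    flat: List[str] = []
    for token in tokens:
        if skip_symbols and token.lang == "SYM":
            continue  # drop punctuation tokens when requested
        for item in token.phones:
            if isinstance(item, list):
                flat.extend(item)  # ZH syllable: [initial, final]
            else:
                flat.append(item)  # EN phone or SYM character
    return flat


# Sample data copied from the first four tokens of the Mandarin output.
tokens = [
    Token(word="你", lang="ZH", pos="r", phones=[["n", "i3"]]),
    Token(word="这个", lang="ZH", pos="r", phones=[["zh", "e4"], ["g", "e5"]]),
    Token(word="idea", lang="EN", pos=None, phones=["AY0", "D", "IY1", "AH0"]),
    Token(word=",", lang="SYM", pos="x", phones=[","]),
]
print(flatten_phones(tokens))
# ['n', 'i3', 'zh', 'e4', 'g', 'e5', 'AY0', 'D', 'IY1', 'AH0']
```

Passing `skip_symbols=False` keeps punctuation in the sequence, which can be useful if it is treated as a pause marker.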

### Cantonese

```python
>>> from g2p_mix import G2pMix
>>> G2pMix(jyut=True).g2p("你这个idea, 不太make sense。", return_seg=True)
```

```
[
  Token(word='你', lang='ZH', pos='PRON', phones=[['n', 'ei5']]),
  Token(word='這個', lang='ZH', pos='PRON', phones=[['z', 'e3'], ['g', 'o3']]),
  Token(word='idea', lang='EN', pos=None, phones=['AY0', 'D', 'IY1', 'AH0']),
  Token(word=',', lang='SYM', pos='x', phones=[',']),
  Token(word='不', lang='ZH', pos='ADV', phones=[['b', 'at1']]),
  Token(word='太', lang='ZH', pos='ADV', phones=[['t', 'aai3']]),
  Token(word='make', lang='EN', pos=None, phones=['M', 'EY1', 'K']),
  Token(word='sense', lang='EN', pos=None, phones=['S', 'EH1', 'N', 'S']),
  Token(word='。', lang='SYM', pos='x', phones=['。']),
]
```
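Because every token carries a `lang` tag, the code-switching points in a sentence can be recovered by grouping consecutive tokens that share a language. The sketch below does this with `itertools.groupby` on data copied from the Cantonese output above. The `Token` tuple and the `language_runs` helper are hypothetical names introduced for this illustration and are not part of g2p_mix.

```python
from itertools import groupby
from typing import List, NamedTuple, Optional, Tuple


class Token(NamedTuple):
    """Hypothetical stand-in mirroring the fields shown in the output above."""
    word: str
    lang: str
    pos: Optional[str]
    phones: list


def language_runs(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Merge consecutive same-language tokens into (lang, text) runs."""
    runs = []
    for lang, group in groupby(tokens, key=lambda t: t.lang):
        words = [t.word for t in group]
        # English words are space-separated; Chinese and symbols are not.
        sep = " " if lang == "EN" else ""
        runs.append((lang, sep.join(words)))
    return runs


# Sample data copied from the Cantonese output.
tokens = [
    Token("你", "ZH", "PRON", [["n", "ei5"]]),
    Token("這個", "ZH", "PRON", [["z", "e3"], ["g", "o3"]]),
    Token("idea", "EN", None, ["AY0", "D", "IY1", "AH0"]),
    Token(",", "SYM", "x", [","]),
    Token("不", "ZH", "ADV", [["b", "at1"]]),
    Token("太", "ZH", "ADV", [["t", "aai3"]]),
    Token("make", "EN", None, ["M", "EY1", "K"]),
    Token("sense", "EN", None, ["S", "EH1", "N", "S"]),
    Token("。", "SYM", "x", ["。"]),
]
for lang, text in language_runs(tokens):
    print(lang, text)
```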

            
