[![Current PyPI packages](https://badge.fury.io/py/spacy-thai.svg)](https://pypi.org/project/spacy-thai/)
# spaCy-Thai
Tokenizer, POS tagger, and dependency parser for the Thai language, based on [Universal Dependencies](https://github.com/UniversalDependencies/UD_Thai-PUD).
## Basic Usage
```py
>>> import spacy_thai
>>> nlp=spacy_thai.load()
>>> doc=nlp("แผนกนี้กำลังเผชิญกับความท้าทายใหม่")
>>> for t in doc:
...     print("\t".join([str(t.i+1),t.orth_,t.lemma_,t.pos_,t.tag_,"_",str(0 if t.head==t else t.head.i+1),t.dep_,"_","_" if t.whitespace_ else "SpaceAfter=No"]))
...
1	แผนก	แผนก	NOUN	NCMN	_	4	nsubj	_	SpaceAfter=No
2	นี้	นี้	DET	DDAC	_	1	det	_	SpaceAfter=No
3	กำลัง	กำลัง	AUX	XVBM	_	4	aux	_	SpaceAfter=No
4	เผชิญ	เผชิญ	VERB	VSTA	_	0	ROOT	_	SpaceAfter=No
5	กับ	กับ	ADP	RPRE	_	6	case	_	SpaceAfter=No
6	ความ	ความ	PART	FIXN	_	4	obl	_	SpaceAfter=No
7	ท้าทาย	ท้าทาย	VERB	VACT	_	6	acl	_	SpaceAfter=No
8	ใหม่	ใหม่	ADV	ADVN	_	7	advmod	_	SpaceAfter=No
>>> import deplacy
>>> deplacy.render(doc,WordRight=True)
 nsubj ╔════════>╔═ NOUN แผนก
   det ║         ╚> DET  นี้
   aux ║ ╔════════> AUX  กำลัง
  ROOT ╚═╚═╔═══════ VERB เผชิญ
  case     ║ ╔════> ADP  กับ
   obl     ╚>╚═╔═══ PART ความ
   acl         ╚>╔═ VERB ท้าทาย
advmod           ╚> ADV  ใหม่
```
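The one-line `print` in the session above can be easier to reuse as a small function. The following is a minimal sketch, not part of spaCy-Thai: `to_conllu` and `make_token` are hypothetical names introduced here, and the stand-in tokens only mimic the spaCy token attributes the session reads (`i`, `orth_`, `lemma_`, `pos_`, `tag_`, `dep_`, `head`, `whitespace_`), so the sketch runs without the model installed.

```python
from types import SimpleNamespace


def to_conllu(doc):
    """Return one tab-separated CoNLL-U style line per token in doc."""
    lines = []
    for t in doc:
        # The root token is its own head; CoNLL-U marks the root with HEAD 0.
        head = 0 if t.head.i == t.i else t.head.i + 1
        # An empty whitespace_ means no space follows the token.
        misc = "_" if t.whitespace_ else "SpaceAfter=No"
        lines.append("\t".join([
            str(t.i + 1), t.orth_, t.lemma_, t.pos_, t.tag_,
            "_", str(head), t.dep_, "_", misc,
        ]))
    return lines


def make_token(i, text, pos, tag, dep):
    # Hypothetical stand-in exposing only the attributes to_conllu reads.
    return SimpleNamespace(i=i, orth_=text, lemma_=text, pos_=pos,
                           tag_=tag, dep_=dep, whitespace_="", head=None)


# Two stand-in tokens: a subject attached to a root verb.
subj = make_token(0, "แผนก", "NOUN", "NCMN", "nsubj")
root = make_token(1, "เผชิญ", "VERB", "VSTA", "ROOT")
subj.head = root
root.head = root

demo_lines = to_conllu([subj, root])
print("\n".join(demo_lines))
```

With the package installed, passing the `doc` returned by `nlp(...)` to `to_conllu` should yield the same kind of lines as the session above, since the function reads the same token attributes.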
## Installation for Linux
```sh
pip3 install spacy_thai --user
```
## Installation for Cygwin
Make sure the packages `python37-devel`, `python37-pip`, `python37-numpy`, `python37-cython`, and `gcc-g++` are installed, and then:
```sh
pip3.7 install spacy_thai
```
## Installation for Google Colaboratory
```py
!pip install spacy_thai
```
Try the [notebook](https://colab.research.google.com/github/KoichiYasuoka/spaCy-Thai/blob/master/spacy_thai.ipynb).