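# ArticutAPI_Taigi (文截 Taiwanese NLP Tool)
Built on the <u>Articut Chinese NLP system</u> developed by [Droidtown Linguistic Tech. Co. Ltd. (卓騰語言科技)](https://api.droidtown.co), **ArticutAPI_Taigi** is an NLP tool for Taiwanese (Taigi) text that provides word segmentation, part-of-speech tagging, and named entity recognition.
Because **Articut_Taigi** is built on Articut, its free quota comes directly from the 2000 characters per hour that Droidtown offers to users of the Articut NLP system. In addition, Articut does not count dictionary words toward the character total, and a large part of **Articut_Taigi**'s Taiwanese functionality relies on a Taiwanese dictionary, so the number of characters actually consumed is lower.
If the free quota of 2000 characters per hour is not enough for your needs, you can purchase an Articut character quota and use the API key that comes with it. I believe the [NT$300 (100,000 characters)] plan is more than enough in the vast majority of cases. ([Purchase link](https://api.droidtown.co/product/))
Even Taiwanese Mandarin (國語), the most widely used language in Taiwan, counts as a minor language with little market value next to the commercial reality of "Mainland Chinese Putonghua", to say nothing of local languages such as Taiwanese Hokkien, Hakka, and the Austronesian languages. Development of NLP tools for these local languages therefore depends on individual support, and I can only contribute as much as I can in my spare time.
If you would like to sponsor the development of **Articut_Taigi (文截 Taiwanese NLP Tool)** and the NLP tools for other local languages of Taiwan that will follow (e.g., Articut_Hakka, Articut_Amis, and so on), you are welcome to sponsor the developer, yours truly, directly: [http://paypal.me/donatepeterwolf](http://paypal.me/donatepeterwolf).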
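### Main features:
- Word segmentation and POS/NER tagging for text written entirely in Han characters (白話字 in the author's usage) (e.g., "歡迎逐家做伙來做台灣語言")
- Word segmentation and POS/NER tagging for text written entirely in Tâi-lô romanization (台羅拼音) (e.g., "huan-gîng ta̍k-ke tsò-hué lâi tsò tâi-uan gí-giân")
- Word segmentation and POS/NER tagging for text that mixes Han characters and Tâi-lô romanization (e.g., "歡迎ta̍k-ke做伙來做 tâi-uan 語言")
- Custom dictionaries
### Advanced features:
- Converting Han-character text to Tâi-lô romanization
- Converting Tâi-lô romanization to Han-character text (in progress…)
### Web interface:
[Institute of Linguistics, National Tsing Hua University: Local Language Word Segmentation System](https://taiwan-lingu.ist/segmentation/)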
## I. Basic usage: word segmentation (WS) / part-of-speech tagging (POS) / named entity recognition (NER)
```python
from ArticutAPI_Taigi import ArticutTG
from pprint import pprint
username = "" # Your account email at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
apikey = "" # The API key shown after you log in at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
articutTG = ArticutTG(username, apikey)
inputSTR = "歡迎逐家做伙來做台灣語言"
resultDICT = articutTG.parse(inputSTR, level="lv2") # lv2 is the default
pprint(resultDICT)
```
### Returned result
```python
{'exec_time': 0.17456531524658203,
'level': 'lv2',
'msg': 'Success!',
'result_obj': [[{'pos': 'ACTION_verb', 'text': '歡迎'},
{'pos': 'ENTITY_pronoun', 'text': '逐家'},
{'pos': 'MODIFIER', 'text': '做伙'},
{'pos': 'ACTION_verb', 'text': '來'},
{'pos': 'ACTION_verb', 'text': '做'},
{'pos': 'LOCATION', 'text': '台灣'},
{'pos': 'ENTITY_noun', 'text': '語言'}]],
'result_pos': ['<ACTION_verb>歡迎</ACTION_verb><ENTITY_pronoun>逐家</ENTITY_pronoun><MODIFIER>做伙</MODIFIER><ACTION_verb>來</ACTION_verb><ACTION_verb>做</ACTION_verb><LOCATION>台灣</LOCATION><ENTITY_noun>語言</ENTITY_noun>'],
'result_segmentation': '歡迎/逐家/做伙/來/做/台灣/語言',
'status': True,
'version': 'v261',
'word_count_balance': 1965}
```
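The returned dictionary can be post-processed with plain Python. The following is a minimal sketch, not part of the library: it works on a copy of the sample result shown above instead of calling the API, and the helper name `tokens_with_pos` is hypothetical. It also assumes that no token contains a literal `/`, so that splitting `result_segmentation` on `/` recovers the words.

```python
# Sample lv2 result copied (abridged) from the output above; no API call is made.
result = {
    "status": True,
    "msg": "Success!",
    "result_obj": [[
        {"pos": "ACTION_verb", "text": "歡迎"},
        {"pos": "ENTITY_pronoun", "text": "逐家"},
        {"pos": "MODIFIER", "text": "做伙"},
        {"pos": "ACTION_verb", "text": "來"},
        {"pos": "ACTION_verb", "text": "做"},
        {"pos": "LOCATION", "text": "台灣"},
        {"pos": "ENTITY_noun", "text": "語言"},
    ]],
    "result_segmentation": "歡迎/逐家/做伙/來/做/台灣/語言",
    "word_count_balance": 1965,
}


def tokens_with_pos(result_dict):
    """Flatten result_obj (a list of sentences) into (text, pos) pairs."""
    if not result_dict.get("status"):
        # Surface the server message when the request did not succeed.
        raise RuntimeError(result_dict.get("msg", "parse failed"))
    return [
        (token["text"], token["pos"])
        for sentence in result_dict["result_obj"]
        for token in sentence
    ]


pairs = tokens_with_pos(result)

# Assumption: tokens never contain "/", so a plain split recovers the words.
words = result["result_segmentation"].split("/")

# Pick out the tokens tagged as place names.
locations = [text for text, pos in pairs if pos == "LOCATION"]

print(pairs)
print(words)
print(locations)
print("remaining quota:", result["word_count_balance"])
```

Checking `status` before reading `result_obj` keeps a failed request from surfacing later as a confusing `KeyError`.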
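## Using a custom dictionary
**ArticutAPI_Taigi** supports custom dictionaries. The dictionary must be saved as a .json file in the following format (the `#` notes are explanatory only; JSON itself does not allow comments, so leave them out of the actual file):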
```json
{
"ACTION_verb" : [ ], #common verb
"ACTION_lightVerb" : [ ], #light verb (e.g., 把, 使…)
"ACTION_quantifiedVerb" : [ ], #quantified verb (e.g., 呷看嘜, 聽看看…: verbs indicating the action is only tried briefly)
"ACTION_eventQuantifier": [ ], #event quantifier (e.g., 趟, 圈…: words used to count how many times an event occurs)
"ASPECT" : [ ], #aspect marker (e.g., the 「過」 in 看過, 吃過…)
"AUX" : [ ], #auxiliary (e.g., 為, 是…)
"CLAUSE_particle" : [ ], #modal particle (e.g., 啊, 嘛…)
"CLAUSE_Q" : [ ], #question word (e.g., 嗎, 是不是…)
"ENTITY_classifier" : [ ], #classifier (e.g., the 「部」 in 一部車, the 「頭」 in 一頭牛)
"ENTITY_DetPhrase" : [ ], #determiner phrase (e.g., 這個, 那位…)
"ENTITY_measurement" : [ ], #measurement phrase (e.g., 兩公斤, 30尺…)
"ENTITY_noun" : [ ], #common noun
"ENTITY_num" : [ ], #number
"ENTITY_person" : [ ], #personal name
"ENTITY_possessive" : [ ], #possessive pronoun (e.g., 我的, 他們的)
"ENTITY_pronoun" : [ ], #pronoun (e.g., 你, 他, 哥哥…)
"FUNC_conjunction" : [ ], #conjunction (e.g., 和, 與…)
"FUNC_degreeHead" : [ ], #degree head (e.g., 很, 非常…)
"FUNC_inner" : [ ], #function word that does not imply another clause (e.g., 的)
"FUNC_inter" : [ ], #function word that implies another clause (e.g., 而且)
"FUNC_negation" : [ ], #negation (e.g., 不, 沒, 嘸…)
"IDIOM" : [ ], #idioms, slang, proverbs
"LOCATION" : [ ], #place name
"MODAL" : [ ], #modal verb (e.g., 會, 能…)
"MODIFIER" : [ ], #adjective
"MODIFIER_color" : [ ], #color adjective
"QUANTIFIER" : [ ], #quantifier (e.g., 八成, 一些…)
"RANGE_locality" : [ ], #locality word (e.g., 附近, 旁邊…)
"RANGE_period" : [ ], #temporal range word (e.g., 之前, 以後…)
"TIME_justtime" : [ ], #short time expression
"TIME_season" : [ ] #season time expression
}
```
To use it, simply specify the dictionary file in .parse():
```python
from ArticutAPI_Taigi import ArticutTG
from pprint import pprint
username = "" # Your account email at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
apikey = "" # The API key shown after you log in at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
articutTG = ArticutTG(username, apikey)
inputSTR = "歡迎逐家做伙來做台灣語言"
resultDICT = articutTG.parse(inputSTR, level="lv2", userDefinedDictFILE="my_dictionary.json")
pprint(resultDICT)
```
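Since JSON does not allow comments, one convenient way to produce a valid dictionary file is to build it in Python and serialize it. The sketch below is an illustration under stated assumptions: the helper `write_user_dict` and the sample entries are hypothetical, and it assumes the service accepts a file containing only a subset of the tags (if it does not, include every tag with an empty list).

```python
import json
import os
import tempfile

# Hypothetical entries for illustration only; replace them with your own words.
custom_dict = {
    "ENTITY_noun": ["台灣語言"],
    "LOCATION": ["台灣"],
}


def write_user_dict(entries, path):
    """Write a tag -> word-list mapping as comment-free, UTF-8 encoded JSON."""
    for tag, words in entries.items():
        # Every tag must map to a list of strings, as in the format above.
        if not isinstance(words, list) or not all(isinstance(w, str) for w in words):
            raise ValueError(f"{tag!r} must map to a list of strings")
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps the words readable in the file.
        json.dump(entries, f, ensure_ascii=False, indent=4)
    return path


# A temporary directory is used here so the sketch does not clutter the project.
dict_path = write_user_dict(
    custom_dict, os.path.join(tempfile.gettempdir(), "my_dictionary.json")
)

# Reading the file back confirms that it is valid JSON.
with open(dict_path, encoding="utf-8") as f:
    reloaded = json.load(f)
print(reloaded)
```

The resulting path can then be passed as `userDefinedDictFILE` in the `.parse()` call shown above.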
---
## II. Advanced usage: converting Han-character text to Tâi-lô romanization
```python
from ArticutAPI_Taigi import ArticutTG
from pprint import pprint
username = "" # Your account email at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
apikey = "" # The API key shown after you log in at https://api.droidtown.co. An empty string uses the shared quota of 2000 characters per hour.
articutTG = ArticutTG(username, apikey)
inputSTR = "歡迎逐家做伙來做台灣語言"
resultDICT = articutTG.parse(inputSTR, level="lv3") # change level from the default lv2 to lv3
pprint(resultDICT)
```
### Returned result
```python
{'entity': [[(179, 181, '語言')]],
'exec_time': 0.1532421112060547,
'level': 'lv3',
'msg': 'Success!',
'person': [[(45, 47, '逐家')]],
'site': [[(153, 155, '台灣')]],
'status': True,
'time': [[]],
'utterance': 'huan-gîng╱ta̍k-ke╱(tsò-hué/tsuè-hé)╱lâi╱(tsò/tsuè)╱tâi-(uan/uân)╱(gí-giân/gú-giân)',
'version': 'v261',
'word_count_balance': 1965}
```
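In the sample output above, `utterance` separates words with `╱` and appears to list alternative readings inside parentheses, separated by `/`. That reading of the format is an assumption based only on this one sample. Under that assumption, a minimal sketch that keeps the first listed reading of each word could look like this; the helper `first_reading` is hypothetical and not part of the library.

```python
import re

# The utterance string copied from the sample lv3 result above.
utterance = "huan-gîng╱ta̍k-ke╱(tsò-hué/tsuè-hé)╱lâi╱(tsò/tsuè)╱tâi-(uan/uân)╱(gí-giân/gú-giân)"

# Matches "(first/second/...)" and captures only the first alternative.
_ALTERNATIVES = re.compile(r"\(([^()/]*)/[^()]*\)")


def first_reading(utt):
    """Split on the word separator and keep the first reading of each word."""
    return [_ALTERNATIVES.sub(r"\1", word) for word in utt.split("╱")]


words = first_reading(utterance)
print(" ".join(words))
```

Keeping the first alternative is an arbitrary choice made for illustration; an application that cares about regional pronunciation would need its own rule for selecting among the variants.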
Raw data
{
"_id": null,
"home_page": "https://github.com/Droidtown/ArticutAPI_Taigi",
"name": "ArticutAPI-Taigi",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6.1",
"maintainer_email": null,
"keywords": "NLP, NLU, CWS, POS, NER, AI, artificial intelligence, Chinese word segmentation, computational linguistics, language, linguistics, graphQL, natural language, natural language processing, natural language understanding, parsing, part-of-speech-embdding, part-of-speech-tagger, pos-tagger, pos-tagging, syntax, tagging, text analytics",
"author": "Droidtown Linguistic Tech. Co. Ltd.",
"author_email": "info@droidtown.co",
"download_url": "https://files.pythonhosted.org/packages/79/d8/81937dc196660b7ae8ab48baef13dec3cf503d3545d51f6a2a45874e93db/ArticutAPI_Taigi-0.95.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Articut NLP system provides not only finest results on Chinese word segmentaion (CWS), Part-of-Speech tagging (POS) and Named Entity Recogintion tagging (NER), but also the fastest online API service in the NLP industry.",
"version": "0.95",
"project_urls": {
"Documentation": "https://api.droidtown.co/ArticutAPI/document/",
"Homepage": "https://github.com/Droidtown/ArticutAPI_Taigi",
"Source": "https://github.com/Droidtown/ArticutAPI"
},
"split_keywords": [
"nlp",
" nlu",
" cws",
" pos",
" ner",
" ai",
" artificial intelligence",
" chinese word segmentation",
" computational linguistics",
" language",
" linguistics",
" graphql",
" natural language",
" natural language processing",
" natural language understanding",
" parsing",
" part-of-speech-embdding",
" part-of-speech-tagger",
" pos-tagger",
" pos-tagging",
" syntax",
" tagging",
" text analytics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "600765e18e115ee5b8ac73cce3277a2cc4c2f1463227232578161c4860525092",
"md5": "50772763cb05dde3eedcdae5efef4f1b",
"sha256": "61d60670d8e460f1b9c0b1ec6ff80a72057b0f95e4e58677935924f3b194a574"
},
"downloads": -1,
"filename": "ArticutAPI_Taigi-0.95-py3-none-any.whl",
"has_sig": false,
"md5_digest": "50772763cb05dde3eedcdae5efef4f1b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6.1",
"size": 653448,
"upload_time": "2024-09-09T15:53:34",
"upload_time_iso_8601": "2024-09-09T15:53:34.266222Z",
"url": "https://files.pythonhosted.org/packages/60/07/65e18e115ee5b8ac73cce3277a2cc4c2f1463227232578161c4860525092/ArticutAPI_Taigi-0.95-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "79d881937dc196660b7ae8ab48baef13dec3cf503d3545d51f6a2a45874e93db",
"md5": "66dd80a8e32735ca303cd3e15d5c142b",
"sha256": "f6fdd2749c37f2ee9c33c7b2c919ccc6484604274195c6e3ff19cbd2e6d383dc"
},
"downloads": -1,
"filename": "ArticutAPI_Taigi-0.95.tar.gz",
"has_sig": false,
"md5_digest": "66dd80a8e32735ca303cd3e15d5c142b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6.1",
"size": 644681,
"upload_time": "2024-09-09T15:53:37",
"upload_time_iso_8601": "2024-09-09T15:53:37.523719Z",
"url": "https://files.pythonhosted.org/packages/79/d8/81937dc196660b7ae8ab48baef13dec3cf503d3545d51f6a2a45874e93db/ArticutAPI_Taigi-0.95.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-09 15:53:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Droidtown",
"github_project": "ArticutAPI_Taigi",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "articutapi-taigi"
}