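# SpacemanX (The Lost Space)
Long ago, when people still knew the difference between full-width and half-width character sets, an educated typist would take care to put a "space" between full-width and half-width characters. Later, as civilization crumbled and humanity's collective intelligence declined, this ancient skill was gradually forgotten.
SpacemanX descends from Spaceman, a module from the Python 2 era, and runs on Python 3.6+. Its job is to ensure that there is at least one space between half-width and full-width characters. For example, `用English寫2次` becomes `用 English 寫 2 次` after SpacemanX processes it (note that spaces now separate the half-width string English from the full-width characters on either side of it, and likewise the half-width character 2).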
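The idea can be illustrated with a minimal sketch. This is not SpacemanX's actual implementation: the function name `add_spaces_sketch` is hypothetical, and the sketch assumes that "full-width" means the CJK Unified Ideographs block and "half-width" means ASCII letters and digits, which is narrower than what a real tool would need to handle.

```python
import re

# Assumption: approximate "full-width" with CJK Unified Ideographs and
# "half-width" with ASCII letters and digits. SpacemanX itself may use
# different character classes.
CJK = r"\u4e00-\u9fff"
HALF = r"A-Za-z0-9"

# A full-width character immediately followed by a half-width one, and vice versa.
_CJK_THEN_HALF = re.compile(f"([{CJK}])([{HALF}])")
_HALF_THEN_CJK = re.compile(f"([{HALF}])([{CJK}])")


def add_spaces_sketch(text: str) -> str:
    """Insert one space at every boundary between the two character classes."""
    text = _CJK_THEN_HALF.sub(r"\1 \2", text)
    text = _HALF_THEN_CJK.sub(r"\1 \2", text)
    return text


print(add_spaces_sketch("用English寫2次"))
```

Because the patterns only match when the two classes touch directly, text that is already spaced is left unchanged, so running the function twice gives the same result as running it once.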
### Main feature:
- Adds spaces! (What else did you expect? Rocket launches?)
### Features:
- String in => string out
- File in => file out
- The mode can be **"modest" (default)** or **"strong"**
- modest mode (default)
```python
import SpacemanX
inputSTR = "這是一個(English)測試"
resultSTR = SpacemanX.makeroom(inputSTR)
print(resultSTR)
# "這是一個 (English) 測試": no space is added between the parentheses and the half-width characters inside them.
# Suited to text meant for reading.
```
- strong mode
```python
import SpacemanX
inputSTR = "這是一個(English)測試"
resultSTR = SpacemanX.makeroom(inputSTR, mode="strong")
print(resultSTR)
# "這是一個 ( English ) 測試": spaces are also added just inside the parentheses.
# Suited to NLP preprocessing.
```
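To make the difference between the two modes concrete, here is a rough sketch that reproduces only the two documented outputs above. It does not come from the SpacemanX source: `makeroom_sketch` is a hypothetical stand-in, and it assumes that ASCII parentheses count as half-width characters and that "strong" differs from "modest" only in padding the inside of the parentheses.

```python
import re

# Assumptions: CJK Unified Ideographs stand in for "full-width"; ASCII letters,
# digits, and parentheses stand in for "half-width".
CJK = r"\u4e00-\u9fff"
HALF = r"A-Za-z0-9()"


def makeroom_sketch(text: str, mode: str = "modest") -> str:
    """Hypothetical illustration of the modest/strong distinction."""
    if mode not in ("modest", "strong"):
        raise ValueError(f"unknown mode: {mode!r}")
    # Both modes: separate full-width and half-width neighbours.
    text = re.sub(f"([{CJK}])([{HALF}])", r"\1 \2", text)
    text = re.sub(f"([{HALF}])([{CJK}])", r"\1 \2", text)
    if mode == "strong":
        # Strong only: also pad the inside of each parenthesis.
        text = re.sub(r"\((?=\S)", "( ", text)
        text = re.sub(r"(?<=\S)\)", " )", text)
    return text


print(makeroom_sketch("這是一個(English)測試"))
print(makeroom_sketch("這是一個(English)測試", mode="strong"))
```

The strong output keeps every token separated by whitespace, which is why it is convenient for whitespace-based tokenizers, while the modest output stays closer to conventional typesetting.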
Raw data
{
"_id": null,
"home_page": "https://github.com/Droidtown/SpacemanX",
"name": "SpacemanX",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6.1",
"maintainer_email": "",
"keywords": "NLP,Chinese word segmentation,computational linguistics,language,linguistics,natural language,natural language processing,natural language understanding,parsing,syntax,text analytics",
"author": "Droidtown Linguistic Tech. Co. Ltd.",
"author_email": "info@droidtown.co",
"download_url": "https://files.pythonhosted.org/packages/e1/13/9ae9dfcb20c30ce6980729c7a76f56699da589ebc9d53ba28d5d98d7cd11/SpacemanX-0.99.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "SpacemanX is a module used to create space (make room) between semi-width and full-width characters.",
"version": "0.99",
"project_urls": {
"Homepage": "https://github.com/Droidtown/SpacemanX",
"Source": "https://github.com/Droidtown/SpacemanX"
},
"split_keywords": [
"nlp",
"chinese word segmentation",
"computational linguistics",
"language",
"linguistics",
"natural language",
"natural language processing",
"natural language understanding",
"parsing",
"syntax",
"text analytics"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "69184a030687b0b290a573b0b8bc7e31414e9992dc7ee8b4102e739a1021b17d",
"md5": "050d9f1288a1a705421a540397c8e364",
"sha256": "def7aec25be8c13a3abca1ac87b5b2fa2b90dfea25f898da694f436abfcde120"
},
"downloads": -1,
"filename": "SpacemanX-0.99-py3-none-any.whl",
"has_sig": false,
"md5_digest": "050d9f1288a1a705421a540397c8e364",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6.1",
"size": 5700,
"upload_time": "2023-09-24T08:34:34",
"upload_time_iso_8601": "2023-09-24T08:34:34.662879Z",
"url": "https://files.pythonhosted.org/packages/69/18/4a030687b0b290a573b0b8bc7e31414e9992dc7ee8b4102e739a1021b17d/SpacemanX-0.99-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e1139ae9dfcb20c30ce6980729c7a76f56699da589ebc9d53ba28d5d98d7cd11",
"md5": "483de893aa76004168b83de26bdf1a36",
"sha256": "a1de6c3125afbe634e3dd8a0c9c230b50da0fb119ee3ffa2d031973a02c737f8"
},
"downloads": -1,
"filename": "SpacemanX-0.99.tar.gz",
"has_sig": false,
"md5_digest": "483de893aa76004168b83de26bdf1a36",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6.1",
"size": 5514,
"upload_time": "2023-09-24T08:34:36",
"upload_time_iso_8601": "2023-09-24T08:34:36.373671Z",
"url": "https://files.pythonhosted.org/packages/e1/13/9ae9dfcb20c30ce6980729c7a76f56699da589ebc9d53ba28d5d98d7cd11/SpacemanX-0.99.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-24 08:34:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Droidtown",
"github_project": "SpacemanX",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "spacemanx"
}