swt-nlp


| Field | Value |
| --- | --- |
| Name | swt-nlp |
| Version | 0.0.57 |
| Summary | simple nlp pipeline |
| Upload time | 2023-02-02 08:43:58 |
| Author | @muuusiiik |
| Requires Python | >=3.7 |
| License | Apache Software License 2.0 |
| Requirements | No requirements were recorded. |

# SWT-NLP PACKAGE
### PACKAGE INSTALLATION
``` shell
pip install swt-nlp
```

### KEYTERM EXTRACTION
#### OBJECTIVE
* extract new keyterms from a corpus

#### DEMO
demo code for keyterm extraction
``` python
from swt.nlp.basis import keyterm_extractor
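# Mockup appears to come from the repository's test suite,
# so it may not be importable after a plain pip install
from tests.keyterm_extraction.test_keyterm_extraction_fit_modules import Mockup

# the corpus is a list of plain-text strings
small_content = Mockup.small_corpus()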
# small_content[:5] = [
# 'อยากกระโดดน้ำที่แม่น้ำโขง',
# 'แม่น้ำที่จังหวัดกาญจนบุรีนี่สุดยอดมาก',
# 'เหล้ามีหลายยี่ห้อ แสงโสม แม่น้ำโขงหรืออะไรก็มีหมดเลย',
# 'ข้อความนี้เกี่ยวกับชิมช็อปใช้',
# 'รัฐบาลผลักดันชิมช็อปใช้มากขึ้น']

# extract new terms
kt = keyterm_extractor()
# - to use a custom tokenizer, pass your own callable tokenizer function
# - this example uses word_tokenize from pythainlp with keep_whitespace=False
# from pythainlp.tokenize import word_tokenize
# custom_tokenizer = lambda t: word_tokenize(t, keep_whitespace=False)
# kt = keyterm_extractor(tokenizer=custom_tokenizer)
kt.fit(small_content)
new_terms = kt.extract()
# new_terms = ['ชิมช็อปใช้', 'แม่น้ำโขง']
```
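
The demo mentions that `keyterm_extractor` can take your own callable tokenizer. As a minimal sketch, assuming the tokenizer only needs to map a string to a list of token strings (the README does not spell out the exact contract), a dependency-free tokenizer might look like this. `simple_tokenizer` is a hypothetical name, not part of swt-nlp, and the regex split is for illustration only: it is not suitable for Thai text, which needs a real word segmenter such as pythainlp.

``` python
import re
from typing import List

# Hypothetical helper, not part of swt-nlp: keeps runs of word characters
# and drops whitespace and punctuation.
_TOKEN_PATTERN = re.compile(r"\w+")


def simple_tokenizer(text: str) -> List[str]:
    """Return the word-like tokens found in `text`, in order."""
    return _TOKEN_PATTERN.findall(text)


# Assumed usage, mirroring the commented-out lines in the demo:
# kt = keyterm_extractor(tokenizer=simple_tokenizer)
```

Any callable with the same shape could presumably be swapped in the same way.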


### HOW TO BUILD AND UPLOAD A PACKAGE TO PYPI
prerequisite
``` shell
pip install setuptools wheel tqdm twine
```

build and upload package
``` shell
# preparing tar.gz package 
python setup.py sdist
# uploading package to pypi server
python -m twine upload dist/{package.tar.gz}  --verbose
```
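
`dist/{package.tar.gz}` in the upload command is a placeholder; the real file name depends on the package version. A small sketch for locating the archive that was just built, assuming the default `dist` output directory (`newest_artifact` is a hypothetical helper, not part of swt-nlp or twine):

``` python
from pathlib import Path
from typing import Optional


def newest_artifact(dist_dir: str = "dist", pattern: str = "*.tar.gz") -> Optional[Path]:
    """Return the most recently modified file matching `pattern`, or None."""
    candidates = list(Path(dist_dir).glob(pattern))
    if not candidates:
        return None
    # pick the file with the latest modification time
    return max(candidates, key=lambda p: p.stat().st_mtime)


# Example: print the path to pass to `python -m twine upload`
# print(newest_artifact())
```

Passing `pattern="*.whl"` would find a wheel instead.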

install package
``` shell
# install latest version
pip install swt-nlp --upgrade
# specific version with no cache
pip install swt-nlp==0.0.11  --no-cache-dir
```
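
After installing, it can be useful to confirm which version pip actually picked up. A minimal sketch using the standard library's `importlib.metadata` (available from Python 3.8; on 3.7 the `importlib_metadata` backport would be needed). `installed_version` is a hypothetical helper name:

``` python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Example: check the distribution installed above
# print(installed_version("swt-nlp"))
```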

install package by wheel
``` shell
# build wheel 
python setup.py bdist_wheel

# install package by wheel 
# use --force-reinstall if needed
pip install dist/{package.whl}
```




            
