sinling


- Name: sinling
- Version: 0.3.6
- Home page: https://github.com/ysenarath/sinling
- Summary: A language processing tool for Sinhalese (සිංහල)
- Upload time: 2020-11-08 00:02:47
- Author: Yasas Senarath
- Requires Python: >=3.6
- Requirements: none recorded
- Travis CI: none
- Coveralls test coverage: none

# A language processing tool for Sinhalese (සිංහල)

`Update 2020.11.01: Fixed the PyPI package. Use 'pip install sinling' to install sinling directly from the repository.`

`Update 2020.08.16: Added a PyPI package at https://pypi.org/project/sinling/.`

`Update 2020.08.16: Integrated the part-of-speech tagger and stemmer tools.`

`Update 2019.07.21: This tool no longer requires Java to run the Sinhala tokenizer. All Java code has been ported to Python for convenience.`

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ysenarath/sinling.git/master?filepath=notebooks%2Fexamples.ipynb)
[![PyPI version](https://badge.fury.io/py/sinling.svg)](https://badge.fury.io/py/sinling)

## Installation

Run the following command in your virtualenv to install this package.

`pip install sinling`
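
To confirm that the install worked, one option is to query the installed version from Python. This is a minimal sketch, assuming Python 3.8 or newer for the standard-library `importlib.metadata` module (the package itself only requires Python 3.6); `installed_version` is a hypothetical helper name and not part of sinling.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if sinling is missing.
print(installed_version("sinling"))
```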

## How to use
### Sinhala Tokenizer
```python
from sinling import SinhalaTokenizer

tokenizer = SinhalaTokenizer()

sentence = '...'  # your sentence

tokenizer.tokenize(sentence)
```

### Sinhala Stemmer (Experimental)
```python
from sinling import SinhalaStemmer

stemmer = SinhalaStemmer()

word = '...'  # your word

stemmer.stem(word)
```

Please cite [sinhala-stemmer](https://github.com/rksk/sinhala-news-analysis/tree/master/sinhala-stemmer) if you are using this implementation.

### Part-of-Speech Tagger

```python
from sinling import SinhalaTokenizer, POSTagger

tokenizer = SinhalaTokenizer()

document = '...'  # may contain multiple sentences

tokenized_sentences = [tokenizer.tokenize(f'{ss}.') for ss in tokenizer.split_sentences(document)]

tagger = POSTagger()

pos_tags = tagger.predict(tokenized_sentences)
```
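
The steps above can be wrapped in a single reusable function. The sketch below is an illustration rather than part of sinling: `tag_document` is a hypothetical name, it relies only on the `split_sentences`, `tokenize`, and `predict` calls shown above, and the early return for empty input is an added assumption about what a caller would want.

```python
def tag_document(document: str, tokenizer, tagger):
    """Split a document into sentences, tokenize each one, and tag them in one batch.

    Follows the snippet above: a '.' is appended to each sentence before
    tokenizing, then all tokenized sentences are passed to `tagger.predict`.
    """
    tokenized = [tokenizer.tokenize(f'{ss}.') for ss in tokenizer.split_sentences(document)]
    if not tokenized:
        return []  # nothing to tag, so the tagger is never called
    return tagger.predict(tokenized)


def main() -> None:
    # Imported here so tag_document can be reused and tested without sinling installed.
    from sinling import SinhalaTokenizer, POSTagger

    document = '...'  # may contain multiple sentences
    print(tag_document(document, SinhalaTokenizer(), POSTagger()))
```

Calling `main()` runs the function against the real tokenizer and tagger. Passing both objects in as arguments means they are constructed once and can be reused across documents.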

### Word Joiner (Morphological Joiner)
```python
from sinling import preprocess, word_joiner

w1 = preprocess('මුනි')
w2 = preprocess('උතුමා')
results = word_joiner.join(w1, w2)
# Returns a list of possible results after applying join rules ['මුනිතුමා', ...]
```
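
Because the joiner returns a list of possible results, a caller may want to narrow them down. One possible approach, sketched below, keeps only the candidates that appear in a word list you supply. `keep_known` and the `vocabulary` argument are hypothetical and not part of sinling, and the sample data is illustrative rather than real joiner output.

```python
from typing import Iterable, List, Set


def keep_known(candidates: Iterable[str], vocabulary: Set[str]) -> List[str]:
    """Return the candidates that appear in `vocabulary`, preserving their order."""
    return [c for c in candidates if c in vocabulary]


# Illustrative data only: in practice `candidates` would be the list returned by
# word_joiner.join(w1, w2) and `vocabulary` your own lexicon. The second entry
# is a plain concatenation made up for this example.
candidates = ['මුනිතුමා', 'මුනිඋතුමා']
vocabulary = {'මුනිතුමා'}
print(keep_known(candidates, vocabulary))
```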

### Word Splitter (Morphological Splitter, Corpus-Based, Experimental)
```python
from sinling import word_splitter

word = '...'
results = word_splitter.split(word)
# Returns a dict containing debug information, base word and affix
```

See [this notebook](https://github.com/ysenarath/sinling/blob/master/scripts/splitter.ipynb) for some sample splits.

## Contributions
- Contact `wayasas.13@cse.mrt.ac.lk` if you would like to contribute to this project.

## License
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/ysenarath/sinling",
    "name": "sinling",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "Yasas Senarath",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/fc/42/b512f34da021decfed3b04fc21b806f96b71d6b76e5e211d492dccd38fe6/sinling-0.3.6.tar.gz",
    "platform": "",
    "bugtrack_url": null,
    "license": "",
    "summary": "A language processing tool for Sinhalese (\u0dc3\u0dd2\u0d82\u0dc4\u0dbd)",
    "version": "0.3.6",
    "project_urls": {
        "Homepage": "https://github.com/ysenarath/sinling"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "23bf43a39e626dfd56002d74c646cc4ab88b396c17e889bad03e4a27fbc10265",
                "md5": "b2ab74213d99462634f0bbba226b9379",
                "sha256": "eb3a58ede6531edd9865c8ae4f39ab74cb044bdc896848a0c088c836a91bb3cc"
            },
            "downloads": -1,
            "filename": "sinling-0.3.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b2ab74213d99462634f0bbba226b9379",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 20018406,
            "upload_time": "2020-11-08T00:02:45",
            "upload_time_iso_8601": "2020-11-08T00:02:45.364387Z",
            "url": "https://files.pythonhosted.org/packages/23/bf/43a39e626dfd56002d74c646cc4ab88b396c17e889bad03e4a27fbc10265/sinling-0.3.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fc42b512f34da021decfed3b04fc21b806f96b71d6b76e5e211d492dccd38fe6",
                "md5": "6584b3ed1312b1e31da7d65aacebad4b",
                "sha256": "a0c9cbd49823aab972b5ad059bb02bd315eff6dea480fc8025324b648919af93"
            },
            "downloads": -1,
            "filename": "sinling-0.3.6.tar.gz",
            "has_sig": false,
            "md5_digest": "6584b3ed1312b1e31da7d65aacebad4b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 19630223,
            "upload_time": "2020-11-08T00:02:47",
            "upload_time_iso_8601": "2020-11-08T00:02:47.798836Z",
            "url": "https://files.pythonhosted.org/packages/fc/42/b512f34da021decfed3b04fc21b806f96b71d6b76e5e211d492dccd38fe6/sinling-0.3.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2020-11-08 00:02:47",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ysenarath",
    "github_project": "sinling",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "sinling"
}
        