=============================================
Zeyrek: Morphological Analyzer and Lemmatizer
=============================================

.. image:: https://img.shields.io/pypi/v/zeyrek.svg
        :target: https://pypi.python.org/pypi/zeyrek

.. image:: https://readthedocs.org/projects/zeyrek/badge/?version=latest
        :target: https://zeyrek.readthedocs.io/en/latest/?badge=latest
        :alt: Documentation Status

.. image:: https://github.com/obulat/zeyrek/workflows/build/badge.svg?branch=master
        :alt: build

Zeyrek is a partial Python port of the Zemberek library for lemmatizing
and morphologically analyzing Turkish words. It is in alpha stage, and the
API will probably change.

* Free software: MIT license
* Documentation: https://zeyrek.readthedocs.io.

Basic Usage
~~~~~~~~~~~

To use Zeyrek, first create an instance of the `MorphAnalyzer` class::

    >>> import zeyrek
    >>> analyzer = zeyrek.MorphAnalyzer()

Then call its `analyze` method on words or texts to get all possible analyses::

    >>> print(analyzer.analyze('benim'))
    Parse(word='benim', lemma='ben', pos='Noun', morphemes=['Noun', 'A3sg', 'P1sg'], formatted='[ben:Noun] ben:Noun+A3sg+im:P1sg')
    Parse(word='benim', lemma='ben', pos='Pron', morphemes=['Pron', 'A1sg', 'Gen'], formatted='[ben:Pron,Pers] ben:Pron+A1sg+im:Gen')
    Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Noun', 'A3sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Noun] ben:Noun+A3sg|Zero→Verb+Pres+im:A1sg')
    Parse(word='benim', lemma='ben', pos='Verb', morphemes=['Pron', 'A1sg', 'Zero', 'Verb', 'Pres', 'A1sg'], formatted='[ben:Pron,Pers] ben:Pron+A1sg|Zero→Verb+Pres+im:A1sg')

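
Once you have the parses, a common next step is to group them by part of speech. The following is only a minimal sketch: the ``Parse`` namedtuple here is a stand-in that mirrors the fields visible in the printed output above (it is not imported from Zeyrek), and ``group_by_pos`` is a hypothetical helper, not part of the library.

```python
from collections import defaultdict, namedtuple

# Stand-in for illustration only: mirrors the fields shown in the printed
# output above. It is NOT the class defined by Zeyrek itself.
Parse = namedtuple("Parse", ["word", "lemma", "pos", "morphemes", "formatted"])


def group_by_pos(parses):
    """Group the formatted analyses of an iterable of parses by their POS tag."""
    grouped = defaultdict(list)
    for parse in parses:
        grouped[parse.pos].append(parse.formatted)
    return dict(grouped)


# Sample data copied from the first two analyses of 'benim' shown above.
sample = [
    Parse("benim", "ben", "Noun", ["Noun", "A3sg", "P1sg"],
          "[ben:Noun] ben:Noun+A3sg+im:P1sg"),
    Parse("benim", "ben", "Pron", ["Pron", "A1sg", "Gen"],
          "[ben:Pron,Pers] ben:Pron+A1sg+im:Gen"),
]

print(sorted(group_by_pos(sample)))  # ['Noun', 'Pron']
```

The helper assumes a flat iterable of parse objects; if `analyze` returns nested results for multi-word input, flatten them first.
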
If you only need the base forms of words (their lemmas), call `lemmatize`. It returns a list
of tuples, each pairing a word with a list of its possible lemmas::

    >>> print(analyzer.lemmatize('benim'))
    [('benim', ['ben'])]

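
The list-of-tuples shape is easy to post-process. As a hedged sketch, the hypothetical helper ``lemma_map`` below (not part of Zeyrek) collapses such a result into a dictionary, merging the lemmas of repeated words. It works on plain Python data, so the sample is simply the output shown above.

```python
from typing import Dict, List, Tuple


def lemma_map(results: List[Tuple[str, List[str]]]) -> Dict[str, List[str]]:
    """Collapse (word, lemmas) pairs into a dict, merging repeated words.

    Lemmas keep their first-seen order and are not duplicated.
    """
    mapping: Dict[str, List[str]] = {}
    for word, lemmas in results:
        seen = mapping.setdefault(word, [])
        for lemma in lemmas:
            if lemma not in seen:
                seen.append(lemma)
    return mapping


# Sample data in the shape documented above for lemmatize('benim').
sample = [("benim", ["ben"])]
print(lemma_map(sample))  # {'benim': ['ben']}
```
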
Credits
-------

This package is a Python port of part of the Zemberek_ package by `Ahmet A. Akın`_.

.. _Zemberek: https://github.com/ahmetaa/zemberek-nlp
.. _Ahmet A. Akın: https://github.com/ahmetaa/

This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.

.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage

Raw data
{
"_id": null,
"home_page": "https://github.com/obulat/zeyrek",
"name": "zeyrek",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "zeyrek",
"author": "Olga Bulat",
"author_email": "obulat@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/04/f2/03238387bb70c2efbc88843032c91af3e278d317d2b120376bc02d1aff04/zeyrek-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT license",
"summary": "Python morphological analyzer and lemmatizer for Turkish",
"version": "0.1.3",
"project_urls": {
"Homepage": "https://github.com/obulat/zeyrek"
},
"split_keywords": [
"zeyrek"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7f5b76970fab035d2e2649ba06af037c394b9690d0e07a8faf53817ccccb3951",
"md5": "6b598cd65ee8cbca6275d6acb65f9201",
"sha256": "23649bb49322a52d1e94959029b047fa4037bc540762819feb1096aa976b25b5"
},
"downloads": -1,
"filename": "zeyrek-0.1.3-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "6b598cd65ee8cbca6275d6acb65f9201",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 930996,
"upload_time": "2022-12-29T12:24:46",
"upload_time_iso_8601": "2022-12-29T12:24:46.910434Z",
"url": "https://files.pythonhosted.org/packages/7f/5b/76970fab035d2e2649ba06af037c394b9690d0e07a8faf53817ccccb3951/zeyrek-0.1.3-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "04f203238387bb70c2efbc88843032c91af3e278d317d2b120376bc02d1aff04",
"md5": "4d1fe96e5a570716ef108398b60e92c4",
"sha256": "428054015258a48a61fdf759ef51be1ccb0437c65584ab3113de29366222a130"
},
"downloads": -1,
"filename": "zeyrek-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "4d1fe96e5a570716ef108398b60e92c4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 929549,
"upload_time": "2022-12-29T12:24:51",
"upload_time_iso_8601": "2022-12-29T12:24:51.312007Z",
"url": "https://files.pythonhosted.org/packages/04/f2/03238387bb70c2efbc88843032c91af3e278d317d2b120376bc02d1aff04/zeyrek-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-29 12:24:51",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "obulat",
"github_project": "zeyrek",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "zeyrek"
}