# Authors
**Author1: [Maksudbek](https://github.com/MaksudSharipov)**
**Author2: [Dasturbek](https://github.com/ddasturbek)**
# Lemma & Lemmatization
UzbekLemma finds the lemmas of Uzbek words by looking them up in a dictionary.
The process of finding a word's lemma is called lemmatization.
There are four approaches to lemmatization: rule-based, dictionary-based, model-based, and hybrid.
This package implements a dictionary-based lemmatization algorithm.
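
To make the dictionary-based idea concrete, here is a minimal sketch. It is not the package's actual algorithm or data: `TOY_LEMMAS`, `TOY_SUFFIXES`, and `toy_lemmatize` are hypothetical names invented for illustration, and the suffix list is deliberately tiny.

```python
# Toy data for illustration only; NOT the package's real dictionary.
TOY_LEMMAS = {"kel": "kelmoq", "kitob": "kitob", "maktab": "maktab"}
TOY_SUFFIXES = ("lar", "gan", "ni", "da")


def toy_lemmatize(word, lemmas=TOY_LEMMAS, suffixes=TOY_SUFFIXES):
    """Strip known suffixes from the end until the stem is in the dictionary."""
    stem = word.lower()
    while stem not in lemmas:
        for suffix in suffixes:
            # Only strip when something is left over after removing the suffix.
            if stem.endswith(suffix) and len(stem) > len(suffix):
                stem = stem[: -len(suffix)]
                break
        else:
            # No suffix matched: give up and return the input unchanged.
            return word
    return lemmas[stem]


print(toy_lemmatize("kelganlar"))  # kelmoq
print(toy_lemmatize("kitobni"))    # kitob
```

The dictionary decides when to stop stripping, which is what separates this approach from a purely rule-based stemmer.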
# Install & Clone
```bash
pip install UzbekLemma
```
```bash
git clone https://github.com/ddasturbek/UzbekLemma.git
```
# Usage
```Python
import UzbekLemma as UL

# Look up the lemma (dictionary form) of an inflected Uzbek word
print(UL.lemmatize("kelganlar"))  # kelmoq
```
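
The example above passes a single word. For running text, one option is to split the text into tokens first and call the lemmatizer on each one. The sketch below assumes `lemmatize` takes one word at a time; `TOKEN_RE` and `lemmatize_text` are hypothetical helpers, not part of the package.

```python
import re

# Runs of letters, optionally joined by the apostrophe variants used in
# Uzbek Latin script (as in o' and g'). A closing quote mark directly
# after a word will also be kept, so this is only a rough tokenizer.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:['’ʻʼ][^\W\d_]*)*")


def lemmatize_text(text, lemmatize):
    """Return (token, lemma) pairs for every word-like token in `text`."""
    return [(token, lemmatize(token)) for token in TOKEN_RE.findall(text)]


# With the package installed, pass its function in:
#   import UzbekLemma as UL
#   pairs = lemmatize_text("Ular kelganlar.", UL.lemmatize)
```

Passing the lemmatizer in as an argument keeps the helper usable with any function that maps a word to its lemma.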
# The algorithm flowchart
<img alt="Flowchart algorithm" src="https://github.com/user-attachments/assets/6504ee82-e98f-46ac-9b09-6dd811809be0"/>
# The dictionary structure
<img alt="Parts of speech (soz_turkumlari)" src="https://github.com/ddasturbek/UzbekLemma/assets/76460501/f9d9b0bd-6549-48cc-91d5-b10b208681b7"/>
# Scientific field
<img alt="Certificate" src="https://github.com/user-attachments/assets/16da0619-5d75-4d46-99e5-a4b3b828e7d7"/>
# Patent
<img alt="image" src="https://github.com/user-attachments/assets/2293c61b-b200-4a46-8433-59f7bd8928b5"/>
# Some results of the program
<img alt="image" src="https://github.com/ddasturbek/UzbekLemma/assets/76460501/2f9455a0-ebff-4677-b947-3cbfbd46bdf4"/>
# Corpus & Results
We collected an equal number of texts from 23 different fields and stored them as a [corpus](https://github.com/ddasturbek/UzbekLemma/tree/main/Corpus).
We ran the program on every file in the corpus and published the [results](https://github.com/ddasturbek/UzbekLemma/tree/main/Results).
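
A run like this could be scripted roughly as follows. This is a sketch under assumptions, not the script used to produce the published results: it assumes the corpus is a directory of UTF-8 `.txt` files, splits on whitespace only, and `lemma_counts` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path


def lemma_counts(corpus_dir, lemmatize, pattern="*.txt"):
    """Count lemmas per file for every file under `corpus_dir` matching `pattern`."""
    results = {}
    for path in sorted(Path(corpus_dir).rglob(pattern)):
        # Whitespace split only: punctuation stays attached to the words.
        words = path.read_text(encoding="utf-8").split()
        results[path.name] = Counter(lemmatize(word) for word in words)
    return results


# With the package installed and the repository cloned:
#   import UzbekLemma as UL
#   counts = lemma_counts("UzbekLemma/Corpus", UL.lemmatize)
```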