# uznltk
**uznltk** is a lightweight NLP (Natural Language Processing) library for the Uzbek language. It covers text cleaning, morphological analysis, conversion between numbers and text, syllable splitting, and a number of other functions.
## 🔗 Links
- [PyPI page](https://pypi.org/project/uznltk)
- [GitHub page](https://github.com/UlugbekSalaev/uznltk)
## 👤 Authors
- **Salaev Ulug'bek** – ulugbek0302@gmail.com
- **Omanov Jasur** – jasuromonov77@gmail.com
- **Zaripboyev Ollabergan** – dewel000per@gmail.com
## 🔧 Install
```bash
pip install uznltk
```
## 🚀 Usage
```python
from uznltk import *
```
## 📚 Functions
### `clean_text(text)`
Normalizes the apostrophe characters specific to Uzbek spelling: the mark in the letters o‘ and g‘, and the tutuq belgisi ( ’ ).
```python
clean_text("O'zbekistonda ta'lim kuchli rivojlanmoqda")
# Result: "O‘zbekistonda ta’lim kuchli rivojlanmoqda"
```
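To make the rule behind this example concrete, here is a minimal standalone sketch of apostrophe normalization. It is not uznltk's implementation: the helper name `normalize_apostrophes` and the heuristic (a mark after `o` or `g` becomes ‘, any other mark between letters becomes ’) are assumptions inferred from the example above.

```python
import re

# An apostrophe-like mark that sits between two word characters.
# The character class lists common typed variants: ' ` ʻ ʼ ‘ ’
_MARK = re.compile(r"(\w)['`ʻʼ‘’](?=\w)")


def normalize_apostrophes(text: str) -> str:
    """Hypothetical helper, shown only to illustrate the idea."""

    def fix(match: re.Match) -> str:
        letter = match.group(1)
        # After o or g the mark is part of the letter (o‘, g‘);
        # after any other letter it is treated as the tutuq belgisi (’).
        mark = "\u2018" if letter in "oOgG" else "\u2019"
        return letter + mark

    return _MARK.sub(fix, text)


print(normalize_apostrophes("O'zbekistonda ta'lim kuchli rivojlanmoqda"))
```

A heuristic like this cannot tell apart the rare words where a tutuq belgisi follows `o` or `g`, so a real implementation would likely need a word list for those cases.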
---
### `solid_sign(text)`
Returns a list of the words that contain the tutuq belgisi ( ’ ).
```python
solid_sign("ta'lim bo'lishi oldindan ma'lum edi")
# Result: ['ta’lim', 'ma’lum']
```
---
### `lemmatize(text)` and `stem_word(text)`
Extracts the stem of a word.
```python
lemmatize("mexanizatorlashtirilganlardan")
# Result: "mexanizatorlashtirilgan"
```
---
### `number_to_text(number)`
Converts a number to Uzbek text.
```python
number_to_text(54)
# Result: "ellik to‘rt"
```
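For readers curious how such a conversion can work, the following is a rough sketch for the range 0 to 999. It is an illustration only, not the library's code: the name `number_to_words` and the range limit are assumptions made here.

```python
ONES = ["", "bir", "ikki", "uch", "to‘rt", "besh",
        "olti", "yetti", "sakkiz", "to‘qqiz"]
TENS = ["", "o‘n", "yigirma", "o‘ttiz", "qirq", "ellik",
        "oltmish", "yetmish", "sakson", "to‘qson"]


def number_to_words(n: int) -> str:
    """Spell out 0..999 in Uzbek (hypothetical helper, not uznltk's API)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n == 0:
        return "nol"
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        # 100 is written here as "bir yuz"; plain "yuz" is also in common use.
        words += [ONES[hundreds], "yuz"]
    tens, ones = divmod(rest, 10)
    if tens:
        words.append(TENS[tens])
    if ones:
        words.append(ONES[ones])
    return " ".join(words)


print(number_to_words(54))
```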
---
### `text_to_number(text)`
Converts a number written out in Uzbek words to its numeric form.
```python
text_to_number("yetmish olti")
# Result: 76
```
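The reverse direction can be sketched as a running sum over number words. This is an assumption-laden illustration rather than the library's algorithm: `words_to_number` is a hypothetical name, and only units, tens, `yuz` (hundred), and `ming` (thousand) are handled.

```python
import re

VALUES = {
    "nol": 0, "bir": 1, "ikki": 2, "uch": 3, "to‘rt": 4, "besh": 5,
    "olti": 6, "yetti": 7, "sakkiz": 8, "to‘qqiz": 9,
    "o‘n": 10, "yigirma": 20, "o‘ttiz": 30, "qirq": 40, "ellik": 50,
    "oltmish": 60, "yetmish": 70, "sakson": 80, "to‘qson": 90,
}


def words_to_number(text: str) -> int:
    """Parse Uzbek number words (hypothetical helper, not uznltk's API)."""
    # Map common apostrophe variants to ‘ so "to'rt" and "to‘rt" both work.
    words = re.sub(r"['`ʻʼ’]", "‘", text.lower()).split()
    if not words:
        raise ValueError("no number words given")
    total = 0  # value of completed thousands
    group = 0  # value of the group currently being read
    for word in words:
        if word in VALUES:
            group += VALUES[word]
        elif word == "yuz":
            # A bare "yuz" means one hundred.
            group = max(group, 1) * 100
        elif word == "ming":
            # A bare "ming" means one thousand.
            total += max(group, 1) * 1000
            group = 0
        else:
            raise ValueError(f"unknown number word: {word!r}")
    return total + group


print(words_to_number("yetmish olti"))
```

The sketch does not validate word order, so nonsensical input such as two tens in a row is summed rather than rejected.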
---
### `download(name)`
Downloads various resources (e.g. books, news).
```python
download("book")
```
---
### `clean_stopword(text)`
Removes stop words from the text.
```python
clean_stopword("salom dunyo, biz sen va u bilan bugun maktabga bordik")
# Result: "salom dunyo, bugun maktabga bordik"
```
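Stop-word removal of this kind can be approximated with a token filter. The sketch below is not uznltk's code: the function name and the five-word stop list are made up for illustration and cover only the words dropped in the example above.

```python
import string

# Hypothetical stop-word list; the library's actual list is not shown here.
STOP_WORDS = {"biz", "sen", "va", "u", "bilan"}


def remove_stop_words(text: str, stop_words=STOP_WORDS) -> str:
    kept = []
    for token in text.split():
        # Compare case-insensitively and without surrounding ASCII
        # punctuation, but keep the original token in the output.
        core = token.strip(string.punctuation).lower()
        if core not in stop_words:
            kept.append(token)
    return " ".join(kept)


print(remove_stop_words("salom dunyo, biz sen va u bilan bugun maktabga bordik"))
```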
---
### `syllables(text)`
Splits each word of the text into syllables and returns one hyphenated string per word.
```python
syllables("Bizga ma’lum ishlar yuz bermoqda!")
# Result: ['Biz-ga', 'ma’-lum', 'ish-lar', 'yuz', 'ber-moq-da!']
```
---
### `hyphenation(text)`
Returns a list of hyphenation variants of the text: each item repeats the whole text with a single syllable break marked in one word.
```python
hyphenation("salom dunyo")
# Result: ['sa-lom dunyo', 'salom dun-yo']
```
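Reading the example above, each list item appears to show the text with exactly one break point. The sketch below reproduces that output shape from words that are already split into syllables. Everything in it is an assumption: the helper name, the input format (a list of words with `-` used only as a syllable separator), and the ordering of the variants.

```python
def hyphenation_variants(syllabified_words):
    """Hypothetical helper: one variant per possible break point."""
    # The unbroken form of every word, used for the words not being split.
    plain = [word.replace("-", "") for word in syllabified_words]
    variants = []
    for i, word in enumerate(syllabified_words):
        parts = word.split("-")
        # Place a single break after the first `cut` syllables of this word.
        for cut in range(1, len(parts)):
            broken = "".join(parts[:cut]) + "-" + "".join(parts[cut:])
            variants.append(" ".join(plain[:i] + [broken] + plain[i + 1:]))
    return variants


print(hyphenation_variants(["sa-lom", "dun-yo"]))
```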
---
### `count_syllable(text)`
Counts the number of syllables in the text.
```python
count_syllable("Salom Dunyo")
# Result: 4
```
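A simple way to approximate this count is to count vowel letters, on the assumption that every Uzbek syllable contains exactly one vowel (a, e, i, o, u, with o‘ counted through its base letter o). This is a heuristic sketch with a hypothetical function name, not the library's implementation, and it may miscount unusual loanwords or abbreviations.

```python
VOWELS = set("aeiou")


def count_syllables_by_vowels(text: str) -> int:
    # Each vowel letter is taken to mark one syllable; o‘ is counted
    # through its base letter "o", so the ‘ mark is ignored.
    return sum(1 for ch in text.lower() if ch in VOWELS)


print(count_syllables_by_vowels("Salom Dunyo"))
```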
---
### `count_text(text)`
Counts the number of words in the text.
```python
count_text("Salom Dunyo")
# Result: 2
```
---
### `split_sentences(text)`
Splits the text into a list of sentences.
```python
split_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli")
# Result: ['Salom Dunyo', 'Bugun ob-havo qisman bulutli']
```
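A basic version of sentence splitting can be written with one regular expression. The sketch below is an assumption about the general approach, not uznltk's code: it splits on `.`, `!`, or `?` followed by whitespace or the end of the text, and it will wrongly split after abbreviations.

```python
import re


def split_into_sentences(text: str) -> list:
    """Hypothetical helper illustrating punctuation-based splitting."""
    # A sentence ends at one or more of . ! ? followed by whitespace
    # or the end of the text, so "3.14" is not split.
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return [part.strip() for part in parts if part.strip()]


print(split_into_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli"))
```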
---
### `split_words(text)`
Returns a list of the words in the text, skipping IP addresses, email addresses, emoji, and URLs.
```python
split_words("sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. Manba https://pypi.org")
# Result: ['sen', 'va', 'elektron', 'manzilidasan', 'Manba']
```
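One way to get this behaviour is to blank out URLs, email addresses, and IPv4 addresses first and then collect runs of letters. The patterns below are deliberately simple assumptions and the function name is hypothetical; the library may use different rules.

```python
import re

URL = r"https?://\S+"
EMAIL = r"\S+@\S+\.\S+"
IPV4 = r"\b\d{1,3}(?:\.\d{1,3}){3}\b"
NOISE = re.compile("|".join([URL, EMAIL, IPV4]))

# A word is a run of letters, optionally joined by an Uzbek apostrophe
# or a hyphen (so "o‘zbek" and "ob-havo" stay whole). Digits and emoji
# are not letters, so they are never matched.
WORD = re.compile(r"[^\W\d_]+(?:['‘’ʻʼ-][^\W\d_]+)*")


def extract_words(text: str) -> list:
    """Hypothetical helper: drop URLs, emails and IPs, keep words."""
    cleaned = NOISE.sub(" ", text)
    return WORD.findall(cleaned)


print(extract_words(
    "sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. "
    "Manba https://pypi.org"
))
```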
---
## 💡 Information
- The library is designed specifically for **the Uzbek language**.
- It includes basic NLP components such as number processing, lemmatization, syllabification, and sentence and word splitting.