uznltk


Name: uznltk
Version: 0.0.14
Home page: https://github.com/UlugbekSalaev/uznltk
Summary: The Uzbek Natural Language Toolkit (NLTK) is a Python package for natural language processing.
Upload time: 2025-07-20 11:10:48
Maintainer: None
Docs URL: None
Author: Ulugbek Salaev
Requires Python: >=3.7
License: MIT
Keywords: nltk, morphology, uzbek-language, pos tagging, morphological tagging
# uznltk

**uznltk** is a lightweight, convenient NLP (Natural Language Processing) library for the Uzbek language. It covers text cleaning, morphological analysis, conversion between numbers and text, syllable splitting, and a number of other functions.

## 🔗 Links

- [PyPI page](https://pypi.org/project/uznltk)
- [GitHub page](https://github.com/UlugbekSalaev/uznltk)

## 👤 Authors

- **Salaev Ulug'bek** – ulugbek0302@gmail.com
- **Omanov Jasur** – jasuromonov77@gmail.com
- **Zaripboyev Ollabergan** – dewel000per@gmail.com

## 🔧 Install

```bash
pip install uznltk
```

## 🚀 Usage

```python
from uznltk import *
```

## 📚 Functions

### `clean_text(text)`

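Corrects the apostrophe-like characters specific to Uzbek Latin script: a straight apostrophe after `o` or `g` becomes ‘ (as in o‘ and g‘), and any other straight apostrophe becomes the solid sign ( ’ ).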

```python
clean_text("O'zbekistonda ta'lim kuchli rivojlanmoqda")
# Result: "O‘zbekistonda ta’lim kuchli rivojlanmoqda"
```
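
The rule behind this example can be sketched without the library. The snippet below is an illustrative assumption, not uznltk's implementation, and `normalize_apostrophes` is a hypothetical name. It treats any apostrophe-like mark after o or g as ‘ (U+2018) and every remaining straight apostrophe or backtick as ’ (U+2019), which reproduces the result shown above.

```python
import re

# Apostrophe-like characters commonly typed in place of the Uzbek marks.
APOSTROPHES = "'`ʻʼ‘’"


def normalize_apostrophes(text: str) -> str:
    """Hypothetical helper: map typed apostrophes to ‘ (after o/g) or ’ (elsewhere)."""
    # After o or g the mark is part of the letters o‘ and g‘.
    text = re.sub(rf"([oOgG])[{APOSTROPHES}]", "\\1\u2018", text)
    # Any straight apostrophe or backtick left over is treated as the solid sign.
    return re.sub(r"['`]", "\u2019", text)


print(normalize_apostrophes("O'zbekistonda ta'lim kuchli rivojlanmoqda"))
```

A purely positional rule like this will mislabel rare words in which a solid sign directly follows o or g, so a dictionary check would be needed for full accuracy.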

---

### `solid_sign(text)`

Returns a list of the words that contain the solid sign ( ’ ). Words whose apostrophe belongs to o‘ or g‘, such as bo‘lishi, are not included.

```python
solid_sign("ta'lim bo'lishi oldindan ma'lum edi")
# Result: ['ta’lim', 'ma’lum']
```
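
One way to tell the two apostrophe roles apart is to look at the preceding letter. The following is a minimal sketch under that assumption, not the library's code; `words_with_solid_sign` is a hypothetical name.

```python
import re
from typing import List


def words_with_solid_sign(text: str) -> List[str]:
    """Hypothetical helper: words with an apostrophe that does not follow o or g."""
    found = []
    for word in text.split():
        # An apostrophe right after o/g belongs to the letters o‘ and g‘, so skip it.
        if re.search(r"(?<![oOgG])['\u2019]", word):
            found.append(word.replace("'", "\u2019"))
    return found


print(words_with_solid_sign("ta'lim bo'lishi oldindan ma'lum edi"))
```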

---

### `lemmatize(text)` and `stem_word(text)`

Extracts the stem of a word.

```python
lemmatize("mexanizatorlashtirilganlardan")
# Result: "mexanizatorlashtirilgan"
```
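
Uzbek is agglutinative, so a stem can often be approximated by peeling suffixes off the end of a word. The sketch below illustrates that idea only; the suffix list is a tiny hand-picked subset, `strip_suffixes` is a hypothetical name, and nothing here is taken from uznltk's own implementation.

```python
# Illustrative subset of Uzbek noun suffixes; a real lemmatizer needs far more.
SUFFIXES = ("lar", "ning", "dan", "ga", "da", "ni")


def strip_suffixes(word: str, min_stem: int = 3) -> str:
    """Hypothetical helper: repeatedly remove a known suffix from the end of a word."""
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            # Keep at least `min_stem` characters so short words are not destroyed.
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
                word = word[: -len(suffix)]
                changed = True
                break
    return word


print(strip_suffixes("mexanizatorlashtirilganlardan"))
```

Blind suffix stripping over-trims stems that happen to end in a suffix-like string, which is why production lemmatizers pair it with a lexicon.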

---

### `number_to_text(number)`

Converts a number to Uzbek text.

```python
number_to_text(54)
# Result: "ellik to‘rt"
```
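
For two-digit numbers the conversion is a lookup of the tens word followed by the ones word. The sketch below covers only 0 to 99 and is an assumption about how such a conversion can work, not the library's code; `small_number_to_text` is a hypothetical name.

```python
ONES = ["", "bir", "ikki", "uch", "to‘rt", "besh", "olti", "yetti", "sakkiz", "to‘qqiz"]
TENS = ["", "o‘n", "yigirma", "o‘ttiz", "qirq", "ellik", "oltmish", "yetmish", "sakson", "to‘qson"]


def small_number_to_text(n: int) -> str:
    """Hypothetical helper: spell out an integer from 0 to 99 in Uzbek."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n == 0:
        return "nol"
    tens, ones = divmod(n, 10)
    # Join the non-empty parts, so 50 gives "ellik" and 4 gives "to‘rt".
    return " ".join(part for part in (TENS[tens], ONES[ones]) if part)


print(small_number_to_text(54))
```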

---

### `text_to_number(text)`

Converts a number written out in Uzbek words to its numeric form.

```python
text_to_number("yetmish olti")
# Result: 76
```
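
For values below one hundred, the reverse conversion amounts to summing the value of each numeral word. The sketch below shows that under stated assumptions: it handles only 0 to 99, does not check word order, and uses the hypothetical name `small_text_to_number`. It is not uznltk's implementation.

```python
VALUES = {
    "nol": 0, "bir": 1, "ikki": 2, "uch": 3, "to‘rt": 4,
    "besh": 5, "olti": 6, "yetti": 7, "sakkiz": 8, "to‘qqiz": 9,
    "o‘n": 10, "yigirma": 20, "o‘ttiz": 30, "qirq": 40, "ellik": 50,
    "oltmish": 60, "yetmish": 70, "sakson": 80, "to‘qson": 90,
}


def small_text_to_number(text: str) -> int:
    """Hypothetical helper: add up the values of Uzbek numeral words (0..99)."""
    total = 0
    for word in text.lower().split():
        if word not in VALUES:
            raise ValueError(f"unknown numeral: {word!r}")
        total += VALUES[word]
    return total


print(small_text_to_number("yetmish olti"))
```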

---

### `download(name)`

Downloads various resources (e.g. books, news).

```python
download("book")
```

---

### `clean_stopword(text)`

Removes stop words from the text.

```python
clean_stopword("salom dunyo, biz sen va u bilan bugun maktabga bordik")
# Result: "salom dunyo, bugun maktabga bordik"
```
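
Stop-word removal is usually a set lookup per token. The sketch below reproduces the example above with a five-word stop list chosen purely for illustration; the list and the name `remove_stopwords` are assumptions, and the stop list shipped with uznltk is certainly different.

```python
import string

# Tiny illustrative stop list, not the one shipped with the library.
STOPWORDS = {"biz", "sen", "va", "u", "bilan"}


def remove_stopwords(text: str) -> str:
    """Hypothetical helper: drop tokens whose bare lowercase form is a stop word."""
    kept = []
    for token in text.split():
        # Compare without surrounding punctuation, but keep the original token.
        bare = token.strip(string.punctuation).lower()
        if bare not in STOPWORDS:
            kept.append(token)
    return " ".join(kept)


print(remove_stopwords("salom dunyo, biz sen va u bilan bugun maktabga bordik"))
```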

---

### `syllables(text)`

Divides words into syllables.

```python
syllables("Bizga ma’lum ishlar yuz bermoqda!")
# Result: ['Biz-ga', 'ma’-lum', 'ish-lar', 'yuz', 'ber-moq-da!']
```

---

### `hyphenation(text)`

Returns a list of variants of the text, each with one possible hyphenation point marked in one word.

```python
hyphenation("salom dunyo")
# Result: ['sa-lom dunyo', 'salom dun-yo']
```
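
Judging by the result above, each list entry is the full text with exactly one break point marked. The sketch below shows how such variants can be enumerated once the syllables are known. It takes pre-split syllables as input, so it says nothing about how the library finds them; `hyphenation_variants` is a hypothetical name.

```python
from typing import List


def hyphenation_variants(syllabified: List[List[str]]) -> List[str]:
    """Hypothetical helper: one output string per possible break point."""
    words = ["".join(parts) for parts in syllabified]
    variants = []
    for i, parts in enumerate(syllabified):
        # A word with n syllables has n - 1 places where it can be broken.
        for k in range(1, len(parts)):
            broken = "".join(parts[:k]) + "-" + "".join(parts[k:])
            variants.append(" ".join(words[:i] + [broken] + words[i + 1:]))
    return variants


print(hyphenation_variants([["sa", "lom"], ["dun", "yo"]]))
```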

---

### `count_syllable(text)`

Counts the number of syllables in the text.

```python
count_syllable("Salom Dunyo")
# Result: 4
```
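
A common shortcut for counting syllables in Latin-script Uzbek is to count vowel letters, assuming every vowel forms its own syllable. The sketch below follows that assumption and is not uznltk's implementation; `count_vowel_syllables` is a hypothetical name. The letter o‘ counts once because its modifier mark is not a vowel.

```python
VOWELS = set("aeiou")


def count_vowel_syllables(text: str) -> int:
    """Hypothetical helper: count vowel letters as a proxy for syllables."""
    return sum(1 for ch in text.lower() if ch in VOWELS)


print(count_vowel_syllables("Salom Dunyo"))
```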

---

### `count_text(text)`

Counts the number of words in the text.

```python
count_text("Salom Dunyo")
# Result: 2
```

---

### `split_sentences(text)`

Splits the text into a list of sentences.

```python
split_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli")
# Result: ['Salom Dunyo', 'Bugun ob-havo qisman bulutli']
```
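
A simple sentence splitter can be written with one regular expression. The sketch below is an assumption about the general technique, not the library's code; `split_into_sentences` is a hypothetical name. It breaks on `.`, `!` or `?` only when followed by whitespace or the end of the text, so decimals such as 3.14 stay intact, though abbreviations will still cause false splits.

```python
import re
from typing import List


def split_into_sentences(text: str) -> List[str]:
    """Hypothetical helper: split on sentence-final punctuation and drop it."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text)
    return [part.strip() for part in parts if part.strip()]


print(split_into_sentences("Salom Dunyo. Bugun ob-havo qisman bulutli"))
```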

---

### `split_words(text)`

Extracts only the words from the text into a list, skipping IP addresses, email addresses, emoji, and URLs.

```python
split_words("sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. Manba https://pypi.org")
# Result: ['sen', 'va', 'elektron', 'manzilidasan', 'Manba']
```
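
The filtering shown above can be approximated by rejecting whole tokens that look like URLs, emails, or IPv4 addresses and then keeping only runs of letters. The patterns and the name `extract_words` below are illustrative assumptions, not uznltk's implementation.

```python
import re
from typing import List

URL = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
EMAIL = re.compile(r"\S+@\S+\.\S+")
IPV4 = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
# Letters only, optionally joined by an apostrophe or hyphen (o‘zbek, ob-havo).
WORD = re.compile(r"[^\W\d_]+(?:[’‘'-][^\W\d_]+)*")


def extract_words(text: str) -> List[str]:
    """Hypothetical helper: keep plain words, skip URLs, emails, IPs and emoji."""
    words = []
    for token in text.split():
        core = token.strip(".,;:!?()[]\"")
        if URL.fullmatch(core) or EMAIL.fullmatch(core) or IPV4.fullmatch(core):
            continue
        # Emoji and digits are not letters, so findall leaves them out.
        words.extend(WORD.findall(core))
    return words


print(extract_words(
    "sen 192.168.1.18 va helloworld@example.com elektron manzilidasan. Manba https://pypi.org"
))
```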

---

## 💡 Information

- The library is designed entirely for **the Uzbek language**.
- It includes basic NLP components such as number processing, lemmatization, and syllabification.


            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/UlugbekSalaev/uznltk",
    "name": "uznltk",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "nltk, morphology, uzbek-language, pos tagging, morphological tagging",
    "author": "Ulugbek Salaev",
    "author_email": "ulugbek0302@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/35/21/0321ee61cb70fbf3c835169c0d6c462fea3f063ac417e7980a8ecf15fa70/uznltk-0.0.14.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "The Uzbek Natural Language Toolkit (NLTK) is a Python package for natural language processing.",
    "version": "0.0.14",
    "project_urls": {
        "Bug Tracker": "https://github.com/UlugbekSalaev/uznltk/issues",
        "Homepage": "https://github.com/UlugbekSalaev/uznltk"
    },
    "split_keywords": [
        "nltk",
        " morphology",
        " uzbek-language",
        " pos tagging",
        " morphological tagging"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0b119a96831b838d28655ae26fc0ebc896b4d8bc2b342b9a7cb0e3c9c0fe82ab",
                "md5": "948f6cc4002917bb5c1ff2114d1cf67d",
                "sha256": "dbdc87f1079448cb8d170cfbc2168c52840fa702d577617bc3854c94788bda20"
            },
            "downloads": -1,
            "filename": "uznltk-0.0.14-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "948f6cc4002917bb5c1ff2114d1cf67d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 9186,
            "upload_time": "2025-07-20T11:10:46",
            "upload_time_iso_8601": "2025-07-20T11:10:46.861508Z",
            "url": "https://files.pythonhosted.org/packages/0b/11/9a96831b838d28655ae26fc0ebc896b4d8bc2b342b9a7cb0e3c9c0fe82ab/uznltk-0.0.14-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "35210321ee61cb70fbf3c835169c0d6c462fea3f063ac417e7980a8ecf15fa70",
                "md5": "e94b0cd5d6a4feed78d1fbf966b4b7a3",
                "sha256": "0d17694e5f14211953f9bcdd405d467621558f6aab41ea114a8f9dd19ce5e8bb"
            },
            "downloads": -1,
            "filename": "uznltk-0.0.14.tar.gz",
            "has_sig": false,
            "md5_digest": "e94b0cd5d6a4feed78d1fbf966b4b7a3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 9737,
            "upload_time": "2025-07-20T11:10:48",
            "upload_time_iso_8601": "2025-07-20T11:10:48.157186Z",
            "url": "https://files.pythonhosted.org/packages/35/21/0321ee61cb70fbf3c835169c0d6c462fea3f063ac417e7980a8ecf15fa70/uznltk-0.0.14.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-20 11:10:48",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "UlugbekSalaev",
    "github_project": "uznltk",
    "github_not_found": true,
    "lcname": "uznltk"
}
        