hazm

Name: hazm
Version: 0.10.0
Home page: https://roshan-ai.ir/hazm/
Summary: Persian NLP Toolkit
Upload time: 2024-01-16 16:48:19
Maintainer: Roshan
Author: Roshan
Requires Python: >=3.8,<4.0
License: MIT
Keywords: nlp, persian nlp, persian

# Hazm - Persian NLP Toolkit [![](https://img.shields.io/twitter/follow/roshan_ai?label=follow)](https://twitter.com/roshan_ai)

![Tests](https://img.shields.io/github/actions/workflow/status/roshan-research/hazm/test.yml?branch=master)
![PyPI - Downloads](https://img.shields.io/github/downloads/roshan-research/hazm/total)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/hazm)
![GitHub](https://img.shields.io/github/license/roshan-research/hazm)

- [Evaluation](#evaluation)
- [Introduction](#introduction)
- [Features](#features)
- [Installation](#installation)
- [Pretrained Models](#pretrained-models)
- [Usage](#usage)
- [Documentation](#documentation)
- [Hazm in other languages](#hazm-in-other-languages)
- [Contribution](#contribution)
- [Thanks](#thanks)
  - [Code contributors](#code-contributors)
  - [Others](#others)

## Evaluation

| Module name      | Accuracy  |
| :--------------- | --------- |
| DependencyParser | **85.6%** |
| POSTagger        | **98.8%** |
| Chunker          | **93.4%** |
| Lemmatizer       | **89.9%** |

| **Module**                     | Metric          | Value   |
| ------------------------------ | --------------- | ------- |
| **SpacyPOSTagger**             | Precision       | 0.99250 |
|                                | Recall          | 0.99249 |
|                                | F1-Score        | 0.99249 |
| **EZ Detection in SpacyPOSTagger** | Precision   | 0.99301 |
|                                | Recall          | 0.99297 |
|                                | F1-Score        | 0.99298 |
| **SpacyChunker**                | Accuracy        | 96.53%  |
|                                | F-Measure       | 95.00%  |
|                                | Recall          | 95.17%  |
|                                | Precision       | 94.83%  |
| **SpacyDependencyParser**       | TOK Accuracy    | 99.06   |
|                                | UAS             | 92.30   |
|                                | LAS             | 89.15   |
|                                | SENT Precision  | 98.84   |
|                                | SENT Recall     | 99.38   |
|                                | SENT F-Measure  | 99.11   |


## Introduction

[**Hazm**](https://www.roshan-ai.ir/hazm/) is a Python library for natural language processing on Persian text. It offers various features for analyzing, processing, and understanding Persian text. You can use Hazm to normalize text, tokenize sentences and words, lemmatize words, assign part-of-speech tags, identify dependency relations, create word and sentence embeddings, and read popular Persian corpora.

## Features

- **Normalization:** Converts text to a standard form, for example by removing diacritics and correcting spacing.
- **Tokenization:** Splits text into sentences and words.
- **Lemmatization:** Reduces words to their base forms.
- **POS tagging:** Assigns a part of speech to each word.
- **Dependency parsing:** Identifies the syntactic relations between words.
- **Embedding:** Creates vector representations of words and sentences.
- **Persian corpora reading:** Easily read popular Persian corpora with ready-made readers and minimal code (see the sketch below).
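
Each supported corpus has a ready-made reader class. Here is a minimal sketch of the corpus-reading feature; it assumes you have downloaded the Bijankhan corpus separately, the file path is illustrative, and the reader name and signature follow the Hazm docs:

```python
>>> from hazm import BijankhanReader

>>> # path to a locally downloaded copy of the Bijankhan corpus (illustrative)
>>> bijankhan = BijankhanReader(bijankhan_file='bijankhan.txt')
>>> next(bijankhan.sents())  # yields one POS-tagged sentence at a time
```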

## Installation

To install the latest version of Hazm, run the following command in your terminal:

    pip install hazm

Alternatively, you can install the latest development version from GitHub (it may be unstable or buggy):

    pip install git+https://github.com/roshan-research/hazm.git
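
After installing, a quick smoke test from the terminal confirms that the package imports and tokenizes (the sample sentence is arbitrary):

    python -c "import hazm; print(hazm.word_tokenize('سلام دنیا'))"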

## Pretrained Models

If you want to use our pretrained models, you can download them from the links below:

| **Module name**                                                                                                                 | **Size** |
| :------------------------------------------------------------------------------------------------------------------------------ | :------- |
| [**Download WordEmbedding**](https://mega.nz/file/GqZUlbpS#XRYP5FHbPK2LnLZ8IExrhrw3ZQ-jclNSVCz59uEhrxY)                         | ~ 5 GB   |
| [**Download SentEmbedding**](https://mega.nz/file/WzR0QChY#J1nG-HGq0UJP69VMY8I1YGl_MfEAFCo5iizpjofA4OY)                         | ~ 1 GB   |
| [**Download POSTagger**](https://drive.google.com/file/d/1Q3JK4NVUC2t5QT63aDiVrCRBV225E_B3)                                     | ~ 18 MB  |
| [**Download DependencyParser**](https://drive.google.com/file/d/1MDapMSUXYfmQlu0etOAkgP5KDiWrNAV6/view?usp=share_link) | ~ 15 MB  |
| [**Download Chunker**](https://drive.google.com/file/d/16hlAb_h7xdlxF4Ukhqk_fOV3g7rItVtk)                                       | ~ 4 MB   |
| [**Download spacy_pos_tagger_parsbertpostagger**](https://huggingface.co/roshan-research/spacy_pos_tagger_parsbertpostagger)    | ~ 630 MB   |
| [**Download spacy_pos_tagger_parsbertpostagger95**](https://huggingface.co/roshan-research/spacy_pos_tagger_parsbertpostagger95)| ~ 630 MB   |
| [**Download spacy_chunker_uncased_bert**](https://huggingface.co/roshan-research/spacy_chunker_uncased_bert)                    | ~ 650 MB   |
| [**Download spacy_chunker_parsbert**](https://huggingface.co/roshan-research/spacy_chunker_parsbert)                            | ~ 630 MB   |
| [**Download spacy_dependency_parser**](https://huggingface.co/roshan-research/spacy_dependency_parser)                          | ~ 630 MB   |
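
Once downloaded, pass the local path of each file to the corresponding class, as shown in the [Usage](#usage) section; a short sketch with illustrative paths:

```python
>>> from hazm import POSTagger, Chunker

>>> # local paths to the downloaded model files (directory names are illustrative)
>>> tagger = POSTagger(model='resources/pos_tagger.model')
>>> chunker = Chunker(model='resources/chunker.model')
```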

## Usage

```python
>>> from hazm import *

>>> normalizer = Normalizer()
>>> normalizer.normalize('اصلاح نويسه ها و استفاده از نیم‌فاصله پردازش را آسان مي كند')
'اصلاح نویسه‌ها و استفاده از نیم‌فاصله پردازش را آسان می‌کند'

>>> sent_tokenize('ما هم برای وصل کردن آمدیم! ولی برای پردازش، جدا بهتر نیست؟')
['ما هم برای وصل کردن آمدیم!', 'ولی برای پردازش، جدا بهتر نیست؟']
>>> word_tokenize('ولی برای پردازش، جدا بهتر نیست؟')
['ولی', 'برای', 'پردازش', '،', 'جدا', 'بهتر', 'نیست', '؟']

>>> stemmer = Stemmer()
>>> stemmer.stem('کتاب‌ها')
'کتاب'
>>> lemmatizer = Lemmatizer()
>>> lemmatizer.lemmatize('می‌روم')
'رفت#رو'

>>> tagger = POSTagger(model='pos_tagger.model')
>>> tagger.tag(word_tokenize('ما بسیار کتاب می‌خوانیم'))
[('ما', 'PRO'), ('بسیار', 'ADV'), ('کتاب', 'N'), ('می‌خوانیم', 'V')]

>>> spacy_posTagger = SpacyPOSTagger(model_path = 'MODELPATH')
>>> spacy_posTagger.tag(tokens = ['من', 'به', 'مدرسه', 'ایران', 'رفته_بودم', '.'])
[('من', 'PRON'), ('به', 'ADP'), ('مدرسه', 'NOUN,EZ'), ('ایران', 'NOUN'), ('رفته_بودم', 'VERB'), ('.', 'PUNCT')]

>>> posTagger = POSTagger(model = 'pos_tagger.model', universal_tag = False)
>>> posTagger.tag(tokens = ['من', 'به', 'مدرسه', 'ایران', 'رفته_بودم', '.'])
[('من', 'PRON'), ('به', 'ADP'), ('مدرسه', 'NOUN'), ('ایران', 'NOUN'), ('رفته_بودم', 'VERB'), ('.', 'PUNCT')] 

>>> chunker = Chunker(model='chunker.model')
>>> tagged = tagger.tag(word_tokenize('کتاب خواندن را دوست داریم'))
>>> tree2brackets(chunker.parse(tagged))
'[کتاب خواندن NP] [را POSTP] [دوست داریم VP]'

>>> spacy_chunker = SpacyChunker(model_path = 'model_path')
>>> tree = spacy_chunker.parse(sentence = [('نامه', 'NOUN,EZ'), ('ایشان', 'PRON'), ('را', 'ADP'), ('دریافت', 'NOUN'), ('داشتم', 'VERB'), ('.', 'PUNCT')])
>>> print(tree)
(S
  (NP نامه/NOUN,EZ ایشان/PRON)
  (POSTP را/ADP)
  (VP دریافت/NOUN داشتم/VERB)
  ./PUNCT)

>>> word_embedding = WordEmbedding(model_type = 'fasttext', model_path = 'word2vec.bin')
>>> word_embedding.doesnt_match(['سلام' ,'درود' ,'خداحافظ' ,'پنجره'])
'پنجره'
>>> word_embedding.doesnt_match(['ساعت' ,'پلنگ' ,'شیر'])
'ساعت'

>>> parser = DependencyParser(tagger=tagger, lemmatizer=lemmatizer)
>>> parser.parse(word_tokenize('زنگ‌ها برای که به صدا درمی‌آید؟'))
<DependencyGraph with 8 nodes>

>>> spacy_parser = SpacyDependencyParser(tagger=tagger, lemmatizer=lemmatizer)
>>> spacy_parser.parse_sents([word_tokenize('زنگ‌ها برای که به صدا درمی‌آید؟')])

```
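
The `parser.parse(...)` call above returns a graph object rather than printed text. Assuming it follows NLTK's `DependencyGraph` API (as the `<DependencyGraph with 8 nodes>` repr suggests), the result can be rendered as a tree:

```python
>>> graph = parser.parse(word_tokenize('زنگ‌ها برای که به صدا درمی‌آید؟'))
>>> print(graph.tree())  # assumes NLTK's DependencyGraph.tree() is available
```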

## Documentation

Visit https://roshan-ai.ir/hazm/docs to view the full documentation.

## Hazm in other languages

**Disclaimer:** These ports are not developed or maintained by Roshan, and they may not have the same functionality or quality as the original Hazm.

- [**JHazm**](https://github.com/mojtaba-khallash/JHazm): A Java port of Hazm
- [**NHazm**](https://github.com/mojtaba-khallash/NHazm): A C# port of Hazm

## Contribution

We welcome and appreciate any contributions to this repo, such as bug reports, feature requests, code improvements, and documentation updates. Please follow the [Contribution guideline](./CONTRIBUTION.md) when contributing: open an issue, fork the repo, write your code, create a pull request, and wait for review and feedback. Thank you for your interest and support!

## Thanks

### Code contributors

![Alt](https://repobeats.axiom.co/api/embed/ae42bda158791645d143c3e3c7f19d8a68d06d08.svg "Repobeats analytics image")

<a href="https://github.com/roshan-research/hazm/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=roshan-research/hazm" />
</a>

### Others 

- Thanks to the [Virastyar](http://virastyar.ir/) project for providing the Persian word list.

[![Star History Chart](https://api.star-history.com/svg?repos=roshan-research/hazm&type=Date)](https://star-history.com/#roshan-research/hazm&Date)

            
