PyArabic


Field | Value
------|------
Name | PyArabic
Version | 0.6.15
Home page | http://pyarabic.sourceforge.net/
Summary | Arabic text tools for Python
Upload time | 2022-06-18 10:47:16
Author | Taha Zerrouki
License | GPL
Requirements | No requirements were recorded.

# PyArabic
PyArabic is an *Arabic language* library for **Python**. It provides basic functions for manipulating Arabic letters and text, such as detecting Arabic letters, identifying letter groups and characteristics, removing diacritics, and comparing vocalization (tashkeel).


Developer: Taha Zerrouki (http://tahadz.com), taha dot zerrouki at gmail dot com

Features |   value
---------|---------------------------------------------------------------------------------
Authors  | Taha Zerrouki: http://tahadz.com,  taha dot zerrouki at gmail dot com
Release  | 0.6.12 
License  |[GPL](https://github.com/linuxscout/pyarabic/master/LICENSE)
Tracker  |[linuxscout/pyarabic/Issues](https://github.com/linuxscout/pyarabic/issues)
Website  |[https://pypi.python.org/pypi/pyarabic](https://pypi.python.org/pypi/pyarabic)
Doc  |[Package documentation](https://pyarabic.readthedocs.io/)
Source  |[Github](http://github.com/linuxscout/pyarabic)
Download  |[pypi.python.org](https://pypi.python.org/pypi/pyarabic)
Feedback  |[Comments](https://github.com/linuxscout/pyarabic/issues)
Accounts  |[@Twitter](https://twitter.com/linuxscout)  [@Sourceforge](http://sourceforge.net/projects/pyarabic/)



## Citation
If you cite PyArabic in academic work, please use this citation:
```
T. Zerrouki, Pyarabic, An Arabic language library for Python,
  https://pypi.python.org/pypi/pyarabic/, 2010
```
or, in BibTeX format:

```bibtex
@misc{zerrouki2012pyarabic,
  title={pyarabic, An Arabic language library for Python},
  author={Zerrouki, Taha},
  url={https://pypi.python.org/pypi/pyarabic},
  year={2010}
}
```


## Features
* Arabic letter classification
* Text tokenization into words or sentences
* Stripping harakat (all tashkeel, all except shadda, shadda only, tatweel, last haraka)
* Separating and joining letters and harakat
* Reducing tashkeel
* Measuring tashkeel similarity between two words (harakat, fully or partially vocalized, similarity with a template)
* Letter normalization (ligatures such as lam-alef, and hamza forms)
* Numbers to words
* Extracting numerical phrases from text
* Pre-vocalization of numerical phrases
* Unshaping Arabic text for systems that do not support letter joining
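
As a rough illustration of the ligature part of letter normalization, the sketch below expands the lam-alef presentation forms into their two underlying letters using only the standard library's `unicodedata` module. It is not PyArabic's implementation, and the helper name `expand_lam_alef` is hypothetical.

```python
import unicodedata

# The four isolated lam-alef presentation forms (U+FEFB, U+FEF7, U+FEF9, U+FEF5).
LAM_ALEF_FORMS = ('\ufefb', '\ufef7', '\ufef9', '\ufef5')


def expand_lam_alef(text):
    """Replace each lam-alef ligature with its two-letter equivalent.

    Hypothetical helper for illustration; only the four forms above are touched.
    """
    for ligature in LAM_ALEF_FORMS:
        # NFKC applies the compatibility decomposition of the presentation form
        # and then recomposes, so alef with hamza or madda stays one code point.
        text = text.replace(ligature, unicodedata.normalize('NFKC', ligature))
    return text


# A three-character string (seen, lam-alef ligature, meem) becomes four letters.
expanded = expand_lam_alef('\u0633\ufefb\u0645')
print(len(expanded))
```

Restricting the replacement to an explicit tuple keeps the rest of the text untouched, which a blanket NFKC pass over the whole string would not guarantee.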


### Applications

* Arabic text processing

### Installation
```
pip install pyarabic
```

### Usage
```python
import pyarabic.araby as araby
import pyarabic.number as number
```




### Package Documentation
[https://pythonhosted.org/PyArabic/](https://pythonhosted.org/PyArabic/)

#### Files
* araby.py: Arabic routines.
* named.py: named-entity recognition.
* unshape.py: unshaping Arabic text.

## Description
PyArabic, a Python library for Arabic, brings together the features and functions a programmer needs when working with Arabic text. It is inspired by the Arabic PHP library by our friend Khaled Al-Sham'aa, which aims to provide an open-source implementation of many Arabic text functions for use in web publishing.

### Defining an Arabic string
The best way to handle Arabic text in Python is to use Unicode, which Python supports natively, with no need for external libraries or special functions. This may be the main reason I chose Python: prefixing a string with the letter `u` is enough to let Python spare you the trouble of thinking about text handling, and it deals with the text transparently. (In Python 3 every `str` is already Unicode, so the prefix is optional.)

Defining an Arabic string in Unicode:

```python
text = u'الإسلام ديننا'
```

Choosing the source file encoding:
```
#!/usr/bin/env python
# -*- coding: utf-8 -*-
```

Displaying Arabic text on output:
```
# Python 3: str is already Unicode, so print it directly
print(text)
```
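
To make the Unicode point concrete, here is a minimal sketch assuming Python 3 and a UTF-8 source file. It uses no PyArabic calls; the variable names are only illustrative.

```python
# In Python 3 every str is a sequence of Unicode code points,
# so the u prefix is accepted but optional.
text = 'الإسلام ديننا'
same_text = u'الإسلام ديننا'

# Encoding is only needed when raw bytes are required (files, sockets).
encoded = text.encode('utf-8')
decoded = encoded.decode('utf-8')

print(text == same_text)
# Arabic letters take two bytes each in UTF-8, so the byte length is larger.
print(len(text), len(encoded))
```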

The library is named pyarabic. It contains many functions grouped into modules:

* araby.py: constants such as the letters, their names and their groups, plus general functions such as stripping harakat, stripping tatweel, comparing the tashkeel of words, and adjusting punctuation.
* number.py (numbers module): functions for converting numbers to words and words to numbers, detecting number words in text, and vocalizing them.
* named.py (named entities module): functions for detecting names and named entities in text.


### General functions module: araby
It can be imported with:
```python
import pyarabic.araby as araby
```

We use the alias `araby` from here on.

General constants in the araby module: they include the Arabic letters, their various groups, and some patterns used later by different functions.

1- Basic Arabic letters, with Latin names for use in programming

The Arabic character constants cover the Arabic letters, a subset of Unicode:

```python
COMMA            = u'\u060C'
SEMICOLON        = u'\u061B'
QUESTION         = u'\u061F'
HAMZA            = u'\u0621'
ALEF_MADDA       = u'\u0622'
ALEF_HAMZA_ABOVE = u'\u0623'
```
More constants are defined in araby.py.
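
Each constant is a single Unicode code point. As a quick check that needs only the standard library, the sketch below looks up the official Unicode name of every constant in the excerpt. The dictionary `CONSTANTS` is an illustrative stand-in, not part of PyArabic.

```python
import unicodedata

# Values copied from the excerpt above; the dict itself is only for illustration.
CONSTANTS = {
    'COMMA': '\u060C',
    'SEMICOLON': '\u061B',
    'QUESTION': '\u061F',
    'HAMZA': '\u0621',
    'ALEF_MADDA': '\u0622',
    'ALEF_HAMZA_ABOVE': '\u0623',
}

for label, char in CONSTANTS.items():
    # Print the constant name, its code point, and the Unicode character name.
    print(label, 'U+%04X' % ord(char), unicodedata.name(char))
```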

The Arabic character set includes the basic letters, the harakat (diacritics), digits, punctuation marks, and some special characters such as the dagger alef, the small yeh, and the lam-alef ligatures in their various forms.

#### Character groups
Letters can be divided into groups and classes that are used later by the various functions.

Group | Description | Members
--------|--------------|------------
Letters | Arabic letters, without harakat | LETTERS = u'ابتةثجحخدذرزسشصضطظعغفقكلمنهويءآأؤإئ'
Tashkeel | Harakat, shadda included | TASHKEEL = (FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA, SUKUN, SHADDA)
Harakat | Harakat, shadda excluded | HARAKAT = (FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA, SUKUN)
Short harakat | Short harakat, without tanwin | SHORTHARAKAT = (FATHA, DAMMA, KASRA, SUKUN)
Tanwin | Tanwin marks | TANWIN = (FATHATAN, DAMMATAN, KASRATAN)
Ligatures | Lam-alef in its various forms | LIGUATURES = (u'ﻻ', u'ﻷ', u'ﻹ', u'ﻵ')
Hamzas | Hamza in its various forms | HAMZAT = (u'ء', u'ؤ', u'ئ', u'ٔ', u'ٕ', u'إ', u'أ')
Alefs | Alef in its various forms | ALEFAT = (u'ا', u'آ', u'أ', u'إ', u'ٱ', u'ى', u'ٰ')
Weak letters | Yeh, waw, and alef | WEAK = (u'ا', u'و', u'ي', u'ى')
Yeh-like | Letters drawn like yeh: the small yeh, alef maksura, and hamza on yeh | YEHLIKE = (u'ي', u'ئ', u'ى', u'ۦ')
Waw-like | Letters drawn like waw | WAWLIKE = (u'و', u'ؤ', u'ۥ')
Teh-like | Teh marbuta and open teh | TEHLIKE = (u'ت', u'ة')
Small letters | Small alef, yeh, and waw | SMALL = (u'ٰ', u'ۥ', u'ۦ')
Moon letters | The moon letters | MOON = (u'ء', u'آ', u'أ', u'إ', u'ا', u'ب', u'ج', u'ح', u'خ', ...
Sun letters | The sun letters | SUN = (u'ت', u'ث', u'د', u'ذ', u'ر', u'ز', u'س', u'ش', u'ص', u...
Alphabetic order | Assigns each Arabic letter an ordinal number: alef is 1, beh is 2, and hamza is 29. | AlphabeticOrder = {u'ء': 29, u'آ': 29, u'أ': 29, u'ؤ': 29, u'إ...
Letter names | Gives each letter its Arabic name | NAMES = {u'ء': u'همزة', u'آ': u'ألف ممدودة', u'أ': u'همزة على ...
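
The group tuples in the table lend themselves to simple membership tests. The following is a minimal standalone sketch: the eight marks are defined here by their Unicode code points rather than imported from the library, and `describe_mark` is a hypothetical helper, not a PyArabic function.

```python
# Code points of the eight marks named in the table.
FATHATAN, DAMMATAN, KASRATAN = '\u064b', '\u064c', '\u064d'
FATHA, DAMMA, KASRA = '\u064e', '\u064f', '\u0650'
SHADDA, SUKUN = '\u0651', '\u0652'

# Groups built the same way as in the table above.
TASHKEEL = (FATHATAN, DAMMATAN, KASRATAN, FATHA, DAMMA, KASRA, SUKUN, SHADDA)
SHORTHARAKAT = (FATHA, DAMMA, KASRA, SUKUN)
TANWIN = (FATHATAN, DAMMATAN, KASRATAN)


def describe_mark(char):
    """Hypothetical helper: name the narrowest group a character belongs to."""
    if char in TANWIN:
        return 'tanwin'
    if char in SHORTHARAKAT:
        return 'short haraka'
    if char == SHADDA:
        return 'shadda'
    return 'not a tashkeel mark'


# kaf+kasra, teh+fatha, alef, beh+dammatan
word = '\u0643\u0650\u062a\u064e\u0627\u0628\u064c'
marks = [describe_mark(c) for c in word if c in TASHKEEL]
print(marks)
```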


#### Functions

##### Main functions

Description | Function
------|------------
Strip all harakat, including shadda | strip_tashkeel(text)
Strip all harakat except shadda | strip_harakat(text)
Strip the last haraka | strip_lastharaka(text)
Strip tatweel | strip_tatweel(text)
Normalize the different hamza forms | normalize_hamza(text)
Tokenize text into words | tokenize(text)
Tokenize text into sentences | sentence_tokenize(text)

See the functions and examples in the features file:
[features.md](https://github.com/linuxscout/pyarabic/blob/master/doc/features.md)
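
To show what the stripping functions in the table are described as doing, here is a standalone sketch built on the standard `re` module. It follows the group definitions given earlier (harakat with or without shadda) and is an assumption about behavior, not the library's code. The `sketch_` names are hypothetical, chosen so they are not mistaken for the real functions.

```python
import re

TATWEEL = '\u0640'
# Fathatan, dammatan, kasratan, fatha, damma, kasra, sukun (no shadda).
HARAKAT = '\u064b\u064c\u064d\u064e\u064f\u0650\u0652'
SHADDA = '\u0651'

_HARAKAT_RE = re.compile('[' + HARAKAT + ']')
_TASHKEEL_RE = re.compile('[' + HARAKAT + SHADDA + ']')


def sketch_strip_harakat(text):
    """Remove every haraka but keep shadda."""
    return _HARAKAT_RE.sub('', text)


def sketch_strip_tashkeel(text):
    """Remove every haraka, shadda included."""
    return _TASHKEEL_RE.sub('', text)


def sketch_strip_tatweel(text):
    """Remove the tatweel (kashida) elongation character."""
    return text.replace(TATWEEL, '')


# meem+damma, hah+fatha, meem+shadda+fatha, dal
word = '\u0645\u064f\u062d\u064e\u0645\u0651\u064e\u062f'
print(len(word), len(sketch_strip_harakat(word)), len(sketch_strip_tashkeel(word)))
```

With the library installed, the corresponding calls would be the `araby` functions listed in the table.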



            
