| Field | Value |
| --- | --- |
| Name | fathah |
| Version | 0.0.2 |
| Home page | https://github.com/fathah/fathah_python |
| Summary | Lightweight NLP preprocessing package for Arabic language |
| Upload time | 2022-12-11 08:48:23 |
| Docs URL | None |
| Author | Abdul Fathah KA |
| Keywords | nlp, fathah, arabic |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# fathah

A lightweight NLP preprocessing package for the Arabic language.



## Installation

```sh
pip install fathah
```

## Usage

```python
from fathah import TextClean
```



## Methods 



### Clean the text

`clean_text` applies all of the following functions:

1. `remove_emails`
2. `remove_URLs`
3. `remove_mentions`
4. `hashtags_to_words`
5. `remove_punctuations`
6. `normalize_arabic`
7. `remove_diacritics`
8. `remove_repeating_char`
9. `remove_stop_words`
10. `remove_emojis`

In other words, `clean_text` applies every function except `remove_hashtags`.

```python
text_cleaned1 = TextClean.clean_text(text)
print(text_cleaned1)
```
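
Conceptually, a combined cleaner of this kind is a chain of single-purpose steps. The sketch below is not fathah's implementation: `run_pipeline`, `drop_digits`, and `squeeze_spaces` are hypothetical stand-ins that only illustrate how such steps can be composed in a fixed order.

```python
import re
from typing import Callable, Iterable

Step = Callable[[str], str]


def drop_digits(text: str) -> str:
    """Stand-in step: remove runs of digits."""
    return re.sub(r"\d+", "", text)


def squeeze_spaces(text: str) -> str:
    """Stand-in step: collapse whitespace runs and trim both ends."""
    return " ".join(text.split())


def run_pipeline(text: str, steps: Iterable[Step]) -> str:
    """Apply each step in order, feeding every result into the next step."""
    for step in steps:
        text = step(text)
    return text


print(run_pipeline("  مرحبا  123  بالعالم ", [drop_digits, squeeze_spaces]))
# prints: مرحبا بالعالم
```

Keeping each step a plain `str -> str` function makes it easy to reorder steps or leave one out.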



### Remove repeating characters

`remove_repeating_char` removes repeated characters from the text.

```python
text_cleaned2 = TextClean.remove_repeating_char(text)
print(text_cleaned2)
```
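
A step like this is commonly built on a regular-expression backreference. The following is a minimal sketch under that assumption, not the package's own code; `collapse_repeats` and its `keep` parameter are hypothetical names introduced here.

```python
import re


def collapse_repeats(text: str, keep: int = 2) -> str:
    """Shorten any run of one character longer than `keep` to `keep` copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # (.) captures a character; \1{keep,} requires at least `keep` more copies.
    pattern = re.compile(r"(.)\1{%d,}" % keep)
    return pattern.sub(lambda m: m.group(1) * keep, text)


stretched = "جم" + "ي" * 5 + "ل"
print(collapse_repeats(stretched))
# prints: جمييل
print(collapse_repeats(stretched, keep=1))
# prints: جميل
```

With `keep=1` every doubled letter is reduced, which also alters words that legitimately contain one (for example الله); the default of 2 in this sketch leaves such words intact.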



### Remove punctuation

`remove_punctuations` removes punctuation marks from the text.

```python
text_cleaned3 = TextClean.remove_punctuations(text)
print(text_cleaned3)
```
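
One way to handle Arabic and Latin punctuation together is to filter by Unicode general category. This is a sketch of that approach using only the standard library; `strip_punctuation` is a hypothetical helper, and fathah may use a fixed character list instead.

```python
import unicodedata


def strip_punctuation(text: str) -> str:
    """Drop every character whose Unicode general category starts with 'P'."""
    return "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )


print(strip_punctuation("مرحبا، كيف الحال؟"))
# prints: مرحبا كيف الحال
```

Because `#`, `@`, and `_` are also classified as punctuation, a filter like this would have to run after any step that relies on those characters. Symbols such as `$` or `+` belong to the `S` categories and are left untouched.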



### Normalize Arabic

`normalize_arabic` normalizes the Arabic text.

```python
text_cleaned4 = TextClean.normalize_arabic(text)
print(text_cleaned4)
```
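
Normalization in Arabic NLP usually means folding letter variants into one canonical form. The mapping below is an assumption chosen for illustration (alef variants, alef maksura, teh marbuta, tatweel); the rules fathah applies may differ, and `normalize_letters` is a hypothetical name.

```python
# Assumed mapping for illustration only.
_NORMALIZATION_TABLE = str.maketrans({
    "\u0622": "\u0627",  # alef with madda above -> bare alef
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u0629": "\u0647",  # teh marbuta -> heh
    "\u0640": None,      # tatweel (kashida) is deleted
})


def normalize_letters(text: str) -> str:
    """Fold common Arabic letter variants into a single form."""
    return text.translate(_NORMALIZATION_TABLE)


print(normalize_letters("إلى المدرسة"))
# prints: الي المدرسه
```

Folding of this kind is lossy, so it suits matching and counting tasks better than text that will be shown to readers.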



### Remove diacritics

`remove_diacritics` removes diacritics (tashkeel) from the text.

```python
text_cleaned5 = TextClean.remove_diacritics(text)
print(text_cleaned5)
```
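
The Arabic short-vowel marks occupy a contiguous Unicode range, so stripping them can be sketched with a single character class. This is an illustration, not the package's code; `strip_diacritics` is a hypothetical name and the exact set of marks fathah removes is not documented here.

```python
import re

# Fathatan (U+064B) through sukun (U+0652), plus superscript alef (U+0670).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Delete Arabic diacritical marks, leaving the base letters."""
    return _DIACRITICS.sub("", text)


print(strip_diacritics("السَّلَامُ عَلَيْكُمْ"))
# prints: السلام عليكم
```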



### Remove stop words

`remove_stop_words` removes stop words from the text.

```python
text_cleaned6 = TextClean.remove_stop_words(text)
print(text_cleaned6)
```
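
Stop-word removal generally comes down to filtering tokens against a word list. The sketch below uses a five-word list invented for illustration; fathah's actual list and tokenization are not shown in this README, and `drop_stop_words` is a hypothetical name.

```python
from typing import Collection

# A tiny illustrative stop list; real lists are far longer.
STOP_WORDS = frozenset({"في", "من", "على", "إلى", "عن"})


def drop_stop_words(text: str, stop_words: Collection[str] = STOP_WORDS) -> str:
    """Split on whitespace and keep only tokens that are not stop words."""
    return " ".join(tok for tok in text.split() if tok not in stop_words)


print(drop_stop_words("ذهب الولد إلى المدرسة في الصباح"))
# prints: ذهب الولد المدرسة الصباح
```

If letters are normalized before this step, the stop list has to be normalized the same way, otherwise entries such as إلى no longer match.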



### Remove emojis

`remove_emojis` removes emojis from the text.

```python
text_cleaned7 = TextClean.remove_emojis(text)
print(text_cleaned7)
```
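
Emoji are spread over several Unicode blocks, so a regex-based remover has to list ranges explicitly. The ranges below are an approximate, non-exhaustive selection made for this sketch; `strip_emojis` is a hypothetical name and does not reflect how fathah detects emoji.

```python
import re

# Approximate emoji ranges (an assumption of this sketch, not a full list).
_EMOJI = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U0001F1E6-\U0001F1FF"  # regional indicators used for flags
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "\uFE0F\u200D"           # variation selector-16 and zero-width joiner
    "]+"
)


def strip_emojis(text: str) -> str:
    """Delete characters that fall inside the listed emoji ranges."""
    return _EMOJI.sub("", text)


print(strip_emojis("صباح الخير \u2600\uFE0F\U0001F600").rstrip())
# prints: صباح الخير
```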



### Remove mentions

`remove_mentions` removes mentions from the text.

```python
text_cleaned8 = TextClean.remove_mentions(text)
print(text_cleaned8)
```
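
Assuming a mention means an `@` followed by a username, as on social media, the step can be sketched with a short pattern. `strip_mentions` is a hypothetical helper, and the whitespace cleanup at the end is a choice of this sketch rather than documented fathah behavior.

```python
import re

# '@' followed by word characters, but not when '@' sits inside a word,
# so the domain part of an email address is left alone.
_MENTION = re.compile(r"(?<!\w)@\w+")


def strip_mentions(text: str) -> str:
    """Remove @username tokens and tidy the leftover whitespace."""
    without_mentions = _MENTION.sub(" ", text)
    return " ".join(without_mentions.split())


print(strip_mentions("شكرا @ahmed_1 على المساعدة"))
# prints: شكرا على المساعدة
```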



### Convert hashtags to words

`hashtags_to_words` converts any hashtags in the text to plain words.

```python
text_cleaned9 = TextClean.hashtags_to_words(text)
print(text_cleaned9)
```



### Remove hashtags

`remove_hashtags` removes hashtags from the text.

```python
text_cleaned10 = TextClean.remove_hashtags(text)
print(text_cleaned10)
```
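
The difference between converting and removing hashtags is easiest to see side by side. The sketch below assumes that converting means dropping the `#` and turning underscores into spaces; that reading, and the names `expand_hashtags` and `drop_hashtags`, are assumptions rather than the package's documented behavior.

```python
import re

_HASHTAG = re.compile(r"#(\w+)")


def expand_hashtags(text: str) -> str:
    """Keep the hashtag text: drop '#' and turn underscores into spaces."""
    return _HASHTAG.sub(lambda m: m.group(1).replace("_", " "), text)


def drop_hashtags(text: str) -> str:
    """Delete each hashtag together with its text."""
    return " ".join(_HASHTAG.sub(" ", text).split())


sample = "أحب #اللغة_العربية"
print(expand_hashtags(sample))
# prints: أحب اللغة العربية
print(drop_hashtags(sample))
# prints: أحب
```

Expanding keeps the topical words available to later steps, which is presumably why `clean_text` includes `hashtags_to_words` but not `remove_hashtags`.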



### Remove emails

`remove_emails` removes email addresses from the text.

```python
text_cleaned11 = TextClean.remove_emails(text)
print(text_cleaned11)
```



### Remove URLs

`remove_URLs` removes URLs from the text.

```python
text_cleaned12 = TextClean.remove_URLs(text)
print(text_cleaned12)
```
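
Email and URL removal are both pattern-matching steps. The patterns below are deliberately simple approximations written for this sketch (they do not cover every valid address or URL), and `strip_emails` and `strip_urls` are hypothetical names, not fathah functions.

```python
import re

# Simplified patterns: local part, '@', then a dotted domain name.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Anything starting with http://, https:// or www. up to the next whitespace.
_URL = re.compile(r"(?:https?://|www\.)\S+")


def strip_emails(text: str) -> str:
    """Remove email-like tokens and tidy the leftover whitespace."""
    return " ".join(_EMAIL.sub(" ", text).split())


def strip_urls(text: str) -> str:
    """Remove URL-like tokens and tidy the leftover whitespace."""
    return " ".join(_URL.sub(" ", text).split())


print(strip_urls("زر https://github.com/fathah/fathah_python للمزيد"))
# prints: زر للمزيد
print(strip_emails("راسلني على user@example.com غدا"))
# prints: راسلني على غدا
```

The URL pattern stops only at whitespace, so punctuation that directly follows a link is removed along with it.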





## Example

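```python
from fathah import TextClean

# Sample input with diacritics (tashkeel), chosen to match the output below
text = "السَّلَامُ عَلَيْكُمْ وَرَحْمَةُ اللَّهِ وَبَرَكَاتُهُ"

cleaner = TextClean(text)
print(cleaner.remove_diacritics())

# Outputs: السلام عليكم ورحمة الله وبركاته
```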





*This package is under development. Contributions are highly welcome.*



[GitHub](https://github.com/fathah) | [IG](https://instagram.com/fatha_cr)


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/fathah/fathah_python",
    "name": "fathah",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "nlp,fathah,arabic",
    "author": "Abdul Fathah KA",
    "author_email": "fathah@ziqx.in",
    "download_url": "https://files.pythonhosted.org/packages/67/83/ae299c84346b5bf62a2f285bf098d07a077b85f193b3218c05c73c51f3b8/fathah-0.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Lightweight NLP preprocessing package for Arabic language",
    "version": "0.0.2",
    "split_keywords": [
        "nlp",
        "fathah",
        "arabic"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "2f4083f3ac2b549b6c5b9176176c35c2",
                "sha256": "9ef6c0e02f13396e510b707c8d8da36769b30dda87bb35d207e4d06da21fa96f"
            },
            "downloads": -1,
            "filename": "fathah-0.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2f4083f3ac2b549b6c5b9176176c35c2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9518,
            "upload_time": "2022-12-11T08:48:19",
            "upload_time_iso_8601": "2022-12-11T08:48:19.340346Z",
            "url": "https://files.pythonhosted.org/packages/5e/41/f553b8d235813779c47c9d5b2267990ab186cafab55ceed36651c21eeef5/fathah-0.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "d79384d81725f3b47a0761f75d37960d",
                "sha256": "c0a4e56cb44d0b6456e0885eaad3990913900b4bbc09fd509b67655a9e4397c2"
            },
            "downloads": -1,
            "filename": "fathah-0.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "d79384d81725f3b47a0761f75d37960d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 10722,
            "upload_time": "2022-12-11T08:48:23",
            "upload_time_iso_8601": "2022-12-11T08:48:23.334208Z",
            "url": "https://files.pythonhosted.org/packages/67/83/ae299c84346b5bf62a2f285bf098d07a077b85f193b3218c05c73c51f3b8/fathah-0.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-11 08:48:23",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "fathah",
    "github_project": "fathah_python",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "fathah"
}
        