# English Text Cleaning Python Package
Text cleaning is a common preprocessing step for almost all NLP tasks. This package was designed mainly for text classification, but it can be used for other NLP tasks as well. Contributions are welcome.
**Install the package**
```bash
pip install eng-text-cleaner
```
The package provides a number of methods for cleaning text: removing emoji, punctuation, HTML tags, URLs, non-word characters (anything other than letters, digits, or underscores), digits, and stopwords, as well as spell correction and lemmatization. A single method, `clean_text`, applies all of these steps in one call. Let's explore the package.
```python
from eng_text_cleaner import preprocessing
```
Start by removing punctuation:
```python
text = "Neither too small nor too large, and nice resolution at a good price."
# create textcleaner instance
textcleaner = preprocessing.TextCleaner()
# remove punctuation
print(textcleaner.remove_punctuation(text))
```
Output:
```bash
Neither too small nor too large and nice resolution at a good price
```
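For readers curious about what punctuation removal involves, here is a minimal standard-library sketch. It is not the package's implementation: `strip_punctuation` is a hypothetical helper introduced only for illustration, and it handles ASCII punctuation only.

```python
import string

# Hypothetical helper for illustration; not part of eng-text-cleaner.
def strip_punctuation(text: str) -> str:
    # Build a translation table that deletes every ASCII punctuation character.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)

print(strip_punctuation("Neither too small nor too large, and nice resolution at a good price."))
```

On the example sentence this sketch prints the same line as the output above. The package's own method may treat non-ASCII punctuation or whitespace differently.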
To clean the text fully:
```python
# fully clean the text
print(textcleaner.clean_text(text))
```
Output:
```bash
neither small large nice resolution good price
```
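To give a rough idea of how several cleaning steps can be chained, the sketch below uses only the standard library. It is an assumption-laden illustration, not the package's code: `simple_clean` and `STOPWORDS` are hypothetical names, the stopword list is deliberately tiny, and emoji removal, spell correction, and lemmatization are left out.

```python
import re
import string

# Tiny illustrative stopword list (an assumption); real lists are much larger.
STOPWORDS = {"a", "an", "and", "at", "the", "too", "nor", "or", "of", "to", "is"}

# Hypothetical helper for illustration; not part of eng-text-cleaner.
def simple_clean(text: str) -> str:
    # Remove HTML tags and URLs first, while their punctuation is still intact.
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # Lowercase, then delete ASCII punctuation and runs of digits.
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    text = re.sub(r"\d+", " ", text)
    # Split on whitespace and drop stopwords.
    tokens = [token for token in text.split() if token not in STOPWORDS]
    return " ".join(tokens)

print(simple_clean("Neither too small nor too large, and nice resolution at a good price."))
```

The order of the steps matters: URLs and tags are stripped before punctuation so that their delimiters are still available to match against. With this small stopword list the sketch happens to print the same result as the output above for the example sentence; other inputs will generally differ from what the package returns.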
Author:
* **Md Abdullah Al Hasib**
Raw data
{
"_id": null,
"home_page": "https://github.com/Al-Hasib/eng_text_cleaner",
"name": "eng-text-cleaner",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "abdullah",
"author_email": "alhasib.iu.cse@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/22/01/008d487986f6c038e59ebd81eef8f99e77820855a28b79f781639104754c/eng_text_cleaner-0.0.5.tar.gz",
"platform": null,
"description": "# Text Cleaning of English Language Python Package\r\n\r\nText Cleaning is a common preprocessing technique for almost all NLP task. Mainly I have designed the package for Text Classification Task. Also You can use it for other NLP task also. You are welcome to contribute the package.\r\n\r\n**Install the package**\r\n\r\n```bash\r\npip install eng-text-cleaner\r\n```\r\n\r\nThere has number of methods to clean the text such as removing emoji, punctuation, html_tags, urls, characters not words or digits or underscore, digits, stopwords, spell correction, lemmatize the words. One Method named clean text will apply all the methods to clean the text at a glance. Let's explore the simple package.\r\n```python\r\nfrom eng_text_cleaner import preprocessing \r\n```\r\nStart by removing punctuation\r\n```python\r\ntext = \"Neither too small nor too large, and nice resolution at a good price.\"\r\n# create textcleaner instance\r\ntextcleaner = preprocessing.TextCleaner()\r\n# remove punctuation\r\ntextcleaner.remove_punctuation(text)\r\n```\r\nOutput:\r\n```bash\r\nNeither too small nor too large and nice resolution at a good price\r\n```\r\nFor Clean the text totally\r\n```python\r\n# fully clean the text\r\ntextcleaner.clean_text(text)\r\n```\r\nOutput:\r\n```bash\r\nneither small large nice resolution good price\r\n```\r\n\r\nAuthor:\r\n* **Md Abdullah Al Hasib**\r\n",
"bugtrack_url": null,
"license": null,
"summary": "This package is for clean the text as text processing",
"version": "0.0.5",
"project_urls": {
"Homepage": "https://github.com/Al-Hasib/eng_text_cleaner"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2201008d487986f6c038e59ebd81eef8f99e77820855a28b79f781639104754c",
"md5": "610f7b7b9bd5d1e896ce54b29b85eaad",
"sha256": "e9a66e2f87b0fd5c47f7012375a8d7f124e83357e7eda7047bb6633cf23898f2"
},
"downloads": -1,
"filename": "eng_text_cleaner-0.0.5.tar.gz",
"has_sig": false,
"md5_digest": "610f7b7b9bd5d1e896ce54b29b85eaad",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 4140,
"upload_time": "2024-08-22T04:57:02",
"upload_time_iso_8601": "2024-08-22T04:57:02.709549Z",
"url": "https://files.pythonhosted.org/packages/22/01/008d487986f6c038e59ebd81eef8f99e77820855a28b79f781639104754c/eng_text_cleaner-0.0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-22 04:57:02",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Al-Hasib",
"github_project": "eng_text_cleaner",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "eng-text-cleaner"
}