# Preprocess YourText
Preprocess YourText is a Python package, published on PyPI as `mngdataclean`, that cleans and prepares text data for natural language processing (NLP) tasks.
## Features
- **HTML Tag Removal**: Easily remove HTML tags from text data.
- **URL Removal**: Remove URLs from text data.
- **Email Removal**: Remove email addresses from text data.
- **Special Character Removal**: Remove special characters from text data.
- **Accent Removal**: Remove accents from characters in text data.
- **Contractions Expansion**: Expand contractions in text data (e.g., "don't" to "do not").
- **Lemmatization**: Lemmatize words in text data to their base form.
- **Spelling Correction**: Correct spelling mistakes in text data.
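To make the feature list concrete, here is a minimal standard-library sketch of what several of these steps typically involve. It is not the package's implementation: every function name, regular expression, and the tiny contractions table below is a hypothetical illustration. Lemmatization and spelling correction are left out because they usually require external language resources.

```python
import re
import unicodedata

# All names below are hypothetical; none of them are part of mngdataclean's API.
TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# A deliberately tiny table; a real one would hold many more entries.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
CONTRACTION_RE = re.compile(
    "|".join(re.escape(key) for key in CONTRACTIONS), re.IGNORECASE
)


def remove_html_tags(text: str) -> str:
    """Drop anything that looks like an HTML tag, keeping the inner text."""
    return TAG_RE.sub("", text)


def remove_urls(text: str) -> str:
    """Drop http(s) and www-style URLs."""
    return URL_RE.sub("", text)


def remove_emails(text: str) -> str:
    """Drop simple email addresses."""
    return EMAIL_RE.sub("", text)


def expand_contractions(text: str) -> str:
    """Replace known contractions; the replacement is always lowercase."""
    return CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def remove_accents(text: str) -> str:
    """Decompose characters, then discard the combining accent marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def remove_special_chars(text: str) -> str:
    """Keep only ASCII letters, digits, and whitespace."""
    return re.sub(r"[^A-Za-z0-9\s]", "", text)


def sketch_clean(text: str) -> str:
    """Apply the steps in an order that keeps them from interfering."""
    steps = (
        remove_html_tags,
        remove_emails,         # before URLs, so addresses are removed whole
        remove_urls,
        expand_contractions,   # before apostrophes are stripped
        remove_accents,        # before non-ASCII letters are stripped
        remove_special_chars,
    )
    for step in steps:
        text = step(text)
    return " ".join(text.split())  # collapse leftover whitespace
```

The order matters in this sketch: contractions are expanded while apostrophes are still present, and accents are folded to plain letters before the special-character filter would otherwise delete them.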
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## Installation
You can install the package via pip:
```bash
pip install mngdataclean
```
## Usage
```python
import mngdataclean as mdc

# Example usage:
text = "This is an example text with HTML tags <b>and URLs</b>."
clean_text = mdc.get_clean(text)
print(clean_text)
# Output:
# This is an example text with HTML tags and URLs.
```
Raw data
{
"_id": null,
"home_page": "https://github.com/Nagaganesh21/mngdataclean",
"name": "mngdataclean",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "text,preprocessing",
"author": "Nagaganesh",
"author_email": "mnagaganesh21@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/1d/36/3abc07be11c27d744de165d982d9a21395db0b7639fa8cc2858417828a8d/mngdataclean-0.4.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Text preprocessing package",
"version": "0.4.2",
"project_urls": {
"Homepage": "https://github.com/Nagaganesh21/mngdataclean"
},
"split_keywords": [
"text",
"preprocessing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fdaab9f62a3066d2fca6398412c82753f6ae4b65fe3900f77b598ab51b109608",
"md5": "9aa4d1c01739061e3cc31b11411151fd",
"sha256": "9d9b694c4e4bf8b851a4e589fd7ce1918ead7f3bc2562f5de26b785729aff000"
},
"downloads": -1,
"filename": "mngdataclean-0.4.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9aa4d1c01739061e3cc31b11411151fd",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 3557,
"upload_time": "2024-03-03T14:27:03",
"upload_time_iso_8601": "2024-03-03T14:27:03.049864Z",
"url": "https://files.pythonhosted.org/packages/fd/aa/b9f62a3066d2fca6398412c82753f6ae4b65fe3900f77b598ab51b109608/mngdataclean-0.4.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1d363abc07be11c27d744de165d982d9a21395db0b7639fa8cc2858417828a8d",
"md5": "39ae576651d7f069ca3395ee7b0a9362",
"sha256": "b0a30fcadf1a1669f2a9f23295abcc7eef177eaa343bca4897da8d4bd40f4e57"
},
"downloads": -1,
"filename": "mngdataclean-0.4.2.tar.gz",
"has_sig": false,
"md5_digest": "39ae576651d7f069ca3395ee7b0a9362",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3322,
"upload_time": "2024-03-03T14:27:07",
"upload_time_iso_8601": "2024-03-03T14:27:07.818503Z",
"url": "https://files.pythonhosted.org/packages/1d/36/3abc07be11c27d744de165d982d9a21395db0b7639fa8cc2858417828a8d/mngdataclean-0.4.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-03-03 14:27:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Nagaganesh21",
"github_project": "mngdataclean",
"github_not_found": true,
"lcname": "mngdataclean"
}