<h1 align = "center">
<img alt = "favicon" src = "https://cdn-icons-png.flaticon.com/512/10306/10306116.png" height = 125px><br>
NLPurify
</h1>
<div align = "center">
[![Documentation Status](https://readthedocs.org/projects/nlpurify/badge/?version=latest&style=plastic)](https://nlpurify.readthedocs.io/en/latest/?badge=latest)
[![GitHub Issues](https://img.shields.io/github/issues/sharkutilities/NLPurify?style=plastic)](https://github.com/sharkutilities/NLPurify/issues)
[![GitHub Forks](https://img.shields.io/github/forks/sharkutilities/NLPurify?style=plastic)](https://github.com/sharkutilities/NLPurify/network)
[![GitHub Stars](https://img.shields.io/github/stars/sharkutilities/NLPurify?style=plastic)](https://github.com/sharkutilities/NLPurify/stargazers)
[![LICENSE File](https://img.shields.io/github/license/sharkutilities/NLPurify?style=plastic)](https://github.com/sharkutilities/NLPurify/blob/master/LICENSE)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/NLPurify?style=plastic)](https://pypistats.org/packages/nlpurify)
[![PyPI Latest Release](https://img.shields.io/pypi/v/NLPurify.svg?style=plastic)](https://pypi.org/project/NLPurify/)
[![GuardRails badge](https://api.guardrails.io/v2/badges/252951?token=2e1d82f6a737cdd3151ea0c869ee61c86196c3a05d17b0d91bf5a032e7766dc0)](https://dashboard.guardrails.io/gh/sharkutilities/repos/252951)
</div>
<div align = "justify">
NLPurify is a text cleaning and extraction engine that combines traditional techniques, such as Unicode translation and
regular-expression cleaning, with modern tools such as natural language processing (NLP) and large language models (LLMs) to
detect and clean long texts and create word vectors.
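
The traditional part of that pipeline can be illustrated with the standard library alone. The snippet below is a minimal sketch, not NLPurify's own API: the function name `clean_text` and the translation table `PUNCT_TABLE` are hypothetical and exist only for this example.

```python
import re
import unicodedata

# Hypothetical translation table (not part of NLPurify):
# maps typographic quotes and dashes to plain ASCII equivalents.
PUNCT_TABLE = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
})

# Any run of whitespace (spaces, tabs, newlines) is collapsed to one space.
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Normalize Unicode, translate punctuation, and collapse whitespace."""
    # NFKC folds compatibility characters, e.g. a no-break space becomes a space.
    text = unicodedata.normalize("NFKC", text)
    # Character-level translation using the table above.
    text = text.translate(PUNCT_TABLE)
    # Regular-expression cleanup of whitespace, then trim both ends.
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_text("\u201cHello\u201d \u2014  world\u00a0!\n"))
```

Running the translation before the regular expression keeps the pattern simple, since it only has to deal with whitespace.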
## Getting Started
The source code is hosted on GitHub: [**sharkutilities/NLPurify**](https://github.com/sharkutilities/NLPurify).
The latest release is available from the [Python Package Index (PyPI)](https://pypi.org/project/NLPurify/).
```bash
pip install -U NLPurify
```
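
To confirm that the installation succeeded, the installed version can be read back through the standard library. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `NLPurify`, as in the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "NLPurify") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


# Prints the version string if NLPurify is installed, otherwise None.
print(installed_version())
```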
The module is currently under development, and new ideas are welcome. To propose one, open an issue or a pull request.
The changes in each release are listed in the [changelog](./CHANGELOG.md).
</div>
---
> [!CAUTION]
> **This code deprecates the existing GitHub Gist which was previously designed.**
> Check [`#1`](https://github.com/sharkutilities/NLPurify/issues/1) for more details.

> [!NOTE]
> **_Legacy_ code is available as a submodule.**
> Check [`#5`](https://github.com/sharkutilities/NLPurify/issues/5) for more details.
Raw data
{
"_id": null,
"home_page": "https://github.com/sharkutilities/NLPurify",
"name": "NLPurify",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "nlp, text-cleaning, nlp-cleaning, llm, utility, utilities, util, utils, functions, wrappers, data science, data analysis, data scientist, data analyst",
"author": "shark-utilities developers",
"author_email": "neuralNOD@outlook.com",
"download_url": "https://files.pythonhosted.org/packages/6f/a9/5c6107e0407036adabf51eadce83f9663e5c26968b08839c679811265905/NLPurify-0.0.1.dev2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Text cleaning and feature extractions using NLP, Traditional approach.",
"version": "0.0.1.dev2",
"project_urls": {
"Homepage": "https://github.com/sharkutilities/NLPurify",
"Issue Tracker": "https://github.com/sharkutilities/NLPurify/issues",
"Org. Homepage": "https://github.com/sharkutilities"
},
"split_keywords": [
"nlp",
" text-cleaning",
" nlp-cleaning",
" llm",
" utility",
" utilities",
" util",
" utils",
" functions",
" wrappers",
" data science",
" data analysis",
" data scientist",
" data analyst"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "6fa95c6107e0407036adabf51eadce83f9663e5c26968b08839c679811265905",
"md5": "b11413702300b68d02f554e6782b21b6",
"sha256": "d79924e62fd64a8408de61170850d2aa263551517817ebdb8af36a35f9acf6b6"
},
"downloads": -1,
"filename": "NLPurify-0.0.1.dev2.tar.gz",
"has_sig": false,
"md5_digest": "b11413702300b68d02f554e6782b21b6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 8794,
"upload_time": "2024-08-18T15:26:57",
"upload_time_iso_8601": "2024-08-18T15:26:57.481798Z",
"url": "https://files.pythonhosted.org/packages/6f/a9/5c6107e0407036adabf51eadce83f9663e5c26968b08839c679811265905/NLPurify-0.0.1.dev2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-18 15:26:57",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "sharkutilities",
"github_project": "NLPurify",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "nlpurify"
}