Name | text-tweet-ben |
Version | 0.0.1 |
home_page | |
Summary | This is for text preprocessing |
upload_time | 2023-04-26 15:10:59 |
maintainer | |
docs_url | None |
author | Behdad Ehsani |
requires_python | |
license | |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Text and Tweet Preprocessing package
This package was created by Behdad (Ben) Ehsani. It is designed to clean tweets in a single pass ("one-shot" coding), and several of its functions can also be used for general text preprocessing. The example below shows typical usage.
## Installing the library
`pip install preprocessing-text-ben`
## Uninstalling the library
`pip uninstall preprocessing-text-ben`
Example of one-shot cleaning:
```
# Assumption: the importable module uses underscores, because hyphens are not
# valid in Python module names. Adjust the name if your installation differs.
import preprocessing_text_ben as pp

def get_clean(x):
    # Convert the string to lowercase
    x = str(x).lower()
    # Expand contractions like "don't" to "do not"
    x = pp.cont_to_exp(x)
    # Remove any email addresses from the string
    x = pp.remove_emails(x)
    # Remove any URLs from the string
    x = pp.remove_urls(x)
    # Remove any HTML tags from the string
    x = pp.remove_html_tags(x)
    # Remove any retweet tags (RT) from the string
    x = pp.remove_rt(x)
    # Remove any accented characters from the string
    x = pp.remove_accented_chars(x)
    # Remove any special characters from the string
    x = pp.remove_special_chars(x)
    # Return the cleaned string
    return x

# Clean a whole column in one shot (df is an existing pandas DataFrame)
df['your_cleaned_column'] = df['your_text_column'].apply(get_clean)
```
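The package's internals are not shown on this page, so the following is only a rough sketch of what each step in the pipeline above might do, written with the standard library alone. Every function name, regular expression, and the small contraction table here are illustrative assumptions, not the package's actual implementation.

```python
import re
import unicodedata

# Hypothetical stand-ins for the package's helpers; NOT its real code.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "i'm": "i am"}  # tiny sample table


def expand_contractions(text):
    # Replace each known contraction with its expanded form
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    return text


def strip_emails(text):
    # Drop anything shaped like name@domain.tld
    return re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b", "", text)


def strip_urls(text):
    # Drop tokens starting with http://, https:// or www.
    return re.sub(r"(?:https?://|www\.)\S+", "", text)


def strip_html_tags(text):
    # Drop <tag> and </tag> markers, keeping the text between them
    return re.sub(r"<[^>]+>", "", text)


def strip_rt(text):
    # Drop the standalone retweet marker (input is already lowercase)
    return re.sub(r"\brt\b", "", text)


def strip_accents(text):
    # Decompose accented characters, then discard the non-ASCII marks
    return unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")


def strip_special_chars(text):
    # Replace punctuation with spaces and collapse runs of whitespace
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def clean_text(text):
    # Apply the steps in the same order as the pipeline above
    text = str(text).lower()
    for step in (expand_contractions, strip_emails, strip_urls, strip_html_tags,
                 strip_rt, strip_accents, strip_special_chars):
        text = step(text)
    return text
```

Order matters in a pipeline like this: emails and URLs are removed before special characters, because stripping punctuation first would break up the `@`, `.` and `/` that those patterns rely on.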
version: 0.0.1
Raw data
{
"_id": null,
"home_page": "",
"name": "text-tweet-ben",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Behdad Ehsani",
"author_email": "behdad.ehsani@hec.ca",
"download_url": "https://files.pythonhosted.org/packages/14/e0/de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066/text_tweet_ben-0.0.1.tar.gz",
"platform": null,
"description": "# Text and Tweet Preprocessing package\n\n\n\nThis package is created by Behdad (Ben) Ehsani. The package is designed for cleaning tweets on Twitter immediately and with one-shot coding. Additionally, some functions can be used for text preprocessing. An example is provided to demonstrate efficient usage.\n\n\n## Installing the library\n\n`pip install preprocessing-text-ben`\n\n## Unistalling the library\n\n`pip uninstall preprocessing-text-ben`\n\n\n\nExample of one-shot cleaning the code: \n\n```\nimport preprocessing-text-ben as pp\n\ndef get_clean(x):\n \n # Convert the string to lowercase\n x = str(x).lower()\n \n # Expand contractions like \"don't\" to \"do not\"\n x = pp.cont_to_exp(x)\n \n # Remove any email addresses from the string\n x = pp.remove_emails(x)\n \n # Remove any URLs from the string\n x = pp.remove_urls(x)\n \n # Remove any HTML tags from the string\n x = pp.remove_html_tags(x)\n \n # Remove any retweet tags (RT) from the string\n x = pp.remove_rt(x)\n \n # Remove any accented characters from the string\n x = pp.remove_accented_chars(x)\n \n # Remove any special characters from the string\n x = pp.remove_special_chars(x)\n \n # Return the cleaned string\n return x\n\n\n#here is the cleaned text in one shot\ndf['your_cleaned_column'] = df['your_text_column'].apply(lambda x: get_clean(x))\n\n```\n\n\n\n\n\n\nversion: 0.0.1\n",
"bugtrack_url": null,
"license": "",
"summary": "This is for text preprocessing",
"version": "0.0.1",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b80fd3ba1b9b23642e488dadd6ae2239e4f9223005772cc041faf7e0ff97ab5e",
"md5": "b6844463d03f481d6379672caa0c008f",
"sha256": "fb338bfcce3362f263bacbc977e6611d6afdc6ffb0ff8130ba284df9fdb4db14"
},
"downloads": -1,
"filename": "text_tweet_ben-0.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b6844463d03f481d6379672caa0c008f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 4139,
"upload_time": "2023-04-26T15:10:54",
"upload_time_iso_8601": "2023-04-26T15:10:54.410865Z",
"url": "https://files.pythonhosted.org/packages/b8/0f/d3ba1b9b23642e488dadd6ae2239e4f9223005772cc041faf7e0ff97ab5e/text_tweet_ben-0.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "14e0de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066",
"md5": "b19d7c111499e4668f46741a0a52ad21",
"sha256": "55a51bda68b9b78c928d5632a544aa7d2bfe3ad1c6fc3d93be42076e3fe61522"
},
"downloads": -1,
"filename": "text_tweet_ben-0.0.1.tar.gz",
"has_sig": false,
"md5_digest": "b19d7c111499e4668f46741a0a52ad21",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3656,
"upload_time": "2023-04-26T15:10:59",
"upload_time_iso_8601": "2023-04-26T15:10:59.040770Z",
"url": "https://files.pythonhosted.org/packages/14/e0/de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066/text_tweet_ben-0.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-26 15:10:59",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "text-tweet-ben"
}
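The `digests` entries in the raw data can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of_file` and `matches_expected` are hypothetical and are not part of this package or of any PyPI tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    # Hash the file in chunks so large archives need not fit in memory
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_sha256):
    # Compare against the hex digest listed in the metadata (case-insensitive)
    return sha256_of_file(path) == expected_sha256.lower()
```

For example, after downloading the sdist, `matches_expected("text_tweet_ben-0.0.1.tar.gz", "55a51bda68b9b78c928d5632a544aa7d2bfe3ad1c6fc3d93be42076e3fe61522")` should return `True` if the file is intact.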