text-tweet-ben


Name: text-tweet-ben
Version: 0.0.1
Summary: This is for text preprocessing
Upload time: 2023-04-26 15:10:59
Author: Behdad Ehsani
Docs URL: None
Requirements: No requirements were recorded.

# Text and Tweet Preprocessing package



This package was created by Behdad (Ben) Ehsani. It is designed to clean tweets in a single pass ("one-shot" coding), and some of its functions can also be used for general text preprocessing. The example below shows the intended usage.


## Installing the library

`pip install preprocessing-text-ben`

## Uninstalling the library

`pip uninstall preprocessing-text-ben`



Example of one-shot cleaning:

```
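# Hyphens are not valid in a Python import statement; the underscore
# spelling below is an assumed module name, so adjust it to match your install.
import preprocessing_text_ben as pp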

def get_clean(x):
    
    # Convert the string to lowercase
    x = str(x).lower()
    
    # Expand contractions like "don't" to "do not"
    x = pp.cont_to_exp(x)
    
    # Remove any email addresses from the string
    x = pp.remove_emails(x)
    
    # Remove any URLs from the string
    x = pp.remove_urls(x)
    
    # Remove any HTML tags from the string
    x = pp.remove_html_tags(x)
    
    # Remove any retweet tags (RT) from the string
    x = pp.remove_rt(x)
    
    # Remove any accented characters from the string
    x = pp.remove_accented_chars(x)
    
    # Remove any special characters from the string
    x = pp.remove_special_chars(x)
    
    # Return the cleaned string
    return x


# Clean the whole column in one shot (df is an existing pandas DataFrame)
df['your_cleaned_column'] = df['your_text_column'].apply(get_clean)

```
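
The package's own implementations are not shown on this page, so the following is only a minimal sketch of what a cleaning pipeline of this shape might do, written with the standard library alone. The function name `sketch_clean`, the regular expressions, and the small contraction table are all illustrative assumptions, not the package's API or behavior.

```python
import re
import unicodedata

# Simplified, assumed patterns; real-world emails, URLs and HTML are messier.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
HTML_TAG_RE = re.compile(r"<[^>]+>")
RT_RE = re.compile(r"\brt\b")            # retweet marker, after lowercasing
SPECIAL_RE = re.compile(r"[^a-z0-9\s]")  # anything but letters, digits, spaces
SPACE_RE = re.compile(r"\s+")

# Tiny hypothetical contraction table, for illustration only.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}


def sketch_clean(text):
    """Apply the same sequence of steps as get_clean, using only the stdlib."""
    text = str(text).lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = EMAIL_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = HTML_TAG_RE.sub(" ", text)
    text = RT_RE.sub(" ", text)
    # Decompose accented characters, then drop the non-ASCII combining marks.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    text = SPECIAL_RE.sub(" ", text)
    # Collapse the runs of spaces left behind by the substitutions.
    return SPACE_RE.sub(" ", text).strip()


tweet = "RT @user: Don't miss the café! <b>Info</b> at https://example.com or mail me@example.com"
print(sketch_clean(tweet))  # user do not miss the cafe info at or mail
```

The order of the steps matters in this sketch: emails, URLs and HTML tags are removed before special characters are stripped, because stripping `@`, `/` and `<` first would leave fragments that the patterns could no longer recognize.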






version: 0.0.1

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "text-tweet-ben",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Behdad Ehsani",
    "author_email": "behdad.ehsani@hec.ca",
    "download_url": "https://files.pythonhosted.org/packages/14/e0/de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066/text_tweet_ben-0.0.1.tar.gz",
    "platform": null,
    "description": "# Text and Tweet Preprocessing package\n\n\n\nThis package is created by Behdad (Ben) Ehsani. The package is designed for cleaning tweets on Twitter immediately and with one-shot coding. Additionally, some functions can be used for text preprocessing. An example is provided to demonstrate efficient usage.\n\n\n## Installing the library\n\n`pip install preprocessing-text-ben`\n\n## Unistalling the library\n\n`pip uninstall preprocessing-text-ben`\n\n\n\nExample of one-shot cleaning the code: \n\n```\nimport preprocessing-text-ben as pp\n\ndef get_clean(x):\n    \n    # Convert the string to lowercase\n    x = str(x).lower()\n    \n    # Expand contractions like \"don't\" to \"do not\"\n    x = pp.cont_to_exp(x)\n    \n    # Remove any email addresses from the string\n    x = pp.remove_emails(x)\n    \n    # Remove any URLs from the string\n    x = pp.remove_urls(x)\n    \n    # Remove any HTML tags from the string\n    x = pp.remove_html_tags(x)\n    \n    # Remove any retweet tags (RT) from the string\n    x = pp.remove_rt(x)\n    \n    # Remove any accented characters from the string\n    x = pp.remove_accented_chars(x)\n    \n    # Remove any special characters from the string\n    x = pp.remove_special_chars(x)\n    \n    # Return the cleaned string\n    return x\n\n\n#here is the cleaned text in one shot\ndf['your_cleaned_column'] = df['your_text_column'].apply(lambda x: get_clean(x))\n\n```\n\n\n\n\n\n\nversion: 0.0.1\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "This is for text preprocessing",
    "version": "0.0.1",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b80fd3ba1b9b23642e488dadd6ae2239e4f9223005772cc041faf7e0ff97ab5e",
                "md5": "b6844463d03f481d6379672caa0c008f",
                "sha256": "fb338bfcce3362f263bacbc977e6611d6afdc6ffb0ff8130ba284df9fdb4db14"
            },
            "downloads": -1,
            "filename": "text_tweet_ben-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b6844463d03f481d6379672caa0c008f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 4139,
            "upload_time": "2023-04-26T15:10:54",
            "upload_time_iso_8601": "2023-04-26T15:10:54.410865Z",
            "url": "https://files.pythonhosted.org/packages/b8/0f/d3ba1b9b23642e488dadd6ae2239e4f9223005772cc041faf7e0ff97ab5e/text_tweet_ben-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "14e0de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066",
                "md5": "b19d7c111499e4668f46741a0a52ad21",
                "sha256": "55a51bda68b9b78c928d5632a544aa7d2bfe3ad1c6fc3d93be42076e3fe61522"
            },
            "downloads": -1,
            "filename": "text_tweet_ben-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "b19d7c111499e4668f46741a0a52ad21",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 3656,
            "upload_time": "2023-04-26T15:10:59",
            "upload_time_iso_8601": "2023-04-26T15:10:59.040770Z",
            "url": "https://files.pythonhosted.org/packages/14/e0/de8311a7c78b4559e1a53ce55c3cde1685c4eecc1b8ea9d768e95b0b9066/text_tweet_ben-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-26 15:10:59",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "text-tweet-ben"
}
        