tweetben


| Field | Value |
|---|---|
| Name | tweetben |
| Version | 0.0.1 |
| Summary | This is for text preprocessing |
| Upload time | 2023-04-26 15:25:20 |
| Author | Behdad Ehsani |
| Docs URL | None |
| Requirements | No requirements were recorded. |

# Text and Tweet Preprocessing package



This package was created by Behdad (Ben) Ehsani. It is designed for cleaning tweets in a single pass with only a few lines of code. Some of its functions can also be used for general text preprocessing. The example below demonstrates typical usage.


## Installing the library

`pip install preprocessing-text-ben`

## Uninstalling the library

`pip uninstall preprocessing-text-ben`



Example of cleaning text in one shot:

```
# NOTE: hyphens are not valid in a Python module name, so
# `import preprocessing-text-ben` is a syntax error. The underscore
# form below is an assumption; check the installed package for the
# actual module name.
import preprocessing_text_ben as pp


def get_clean(x):
    """Clean a single tweet or text string in one pass."""
    # Convert the input to a lowercase string
    x = str(x).lower()
    # Expand contractions like "don't" to "do not"
    x = pp.cont_to_exp(x)
    # Remove any email addresses from the string
    x = pp.remove_emails(x)
    # Remove any URLs from the string
    x = pp.remove_urls(x)
    # Remove any HTML tags from the string
    x = pp.remove_html_tags(x)
    # Remove any retweet tags (RT) from the string
    x = pp.remove_rt(x)
    # Remove any accented characters from the string
    x = pp.remove_accented_chars(x)
    # Remove any special characters from the string
    x = pp.remove_special_chars(x)
    # Return the cleaned string
    return x


# Clean the whole column in one shot.
# `df` is assumed to be an existing pandas DataFrame with a text column.
df['your_cleaned_column'] = df['your_text_column'].apply(get_clean)

```
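
To see roughly what each step in `get_clean` does, the same sequence can be approximated with the standard library alone. This is only a sketch and not the package's implementation: the regular expressions and the names `clean_text_sketch` and `strip_accents` are assumptions made for illustration, and contraction expansion is left out because it needs a lookup table.

```python
import re
import unicodedata

# Illustrative patterns only; assumptions, not the package's own rules.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
HTML_TAG_RE = re.compile(r"<[^>]+>")
RT_RE = re.compile(r"\brt\b")          # applied after lowercasing
NON_WORD_RE = re.compile(r"[^\w\s]")   # punctuation and symbols
SPACES_RE = re.compile(r"\s+")


def strip_accents(text):
    """Replace accented characters with their closest ASCII equivalents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_text_sketch(text):
    """Apply the cleaning steps in the same order as get_clean."""
    text = str(text).lower()
    text = EMAIL_RE.sub(" ", text)      # email addresses
    text = URL_RE.sub(" ", text)        # URLs
    text = HTML_TAG_RE.sub(" ", text)   # HTML tags
    text = RT_RE.sub(" ", text)         # retweet markers
    text = strip_accents(text)          # accented characters
    text = NON_WORD_RE.sub(" ", text)   # special characters
    # Collapse the whitespace left behind by the substitutions
    return SPACES_RE.sub(" ", text).strip()
```

Emails are removed before special characters so that the `@` and `.` they depend on are still present when the pattern runs; the same reasoning applies to URLs and HTML tags.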






version: 0.0.1

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "tweetben",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Behdad Ehsani",
    "author_email": "behdad.ehsani@hec.ca",
    "download_url": "https://files.pythonhosted.org/packages/a3/0b/2d7e915e08bb9da05a34bb4c601d0acebdceef47dd0dc6ba21fd53f18b04/tweetben-0.0.1.tar.gz",
    "platform": null,
    "description": "# Text and Tweet Preprocessing package\n\n\n\nThis package is created by Behdad (Ben) Ehsani. The package is designed for cleaning tweets on Twitter immediately and with one-shot coding. Additionally, some functions can be used for text preprocessing. An example is provided to demonstrate efficient usage.\n\n\n## Installing the library\n\n`pip install preprocessing-text-ben`\n\n## Unistalling the library\n\n`pip uninstall preprocessing-text-ben`\n\n\n\nExample of one-shot cleaning the code: \n\n```\nimport preprocessing-text-ben as pp\n\ndef get_clean(x):\n    \n    # Convert the string to lowercase\n    x = str(x).lower()\n    \n    # Expand contractions like \"don't\" to \"do not\"\n    x = pp.cont_to_exp(x)\n    \n    # Remove any email addresses from the string\n    x = pp.remove_emails(x)\n    \n    # Remove any URLs from the string\n    x = pp.remove_urls(x)\n    \n    # Remove any HTML tags from the string\n    x = pp.remove_html_tags(x)\n    \n    # Remove any retweet tags (RT) from the string\n    x = pp.remove_rt(x)\n    \n    # Remove any accented characters from the string\n    x = pp.remove_accented_chars(x)\n    \n    # Remove any special characters from the string\n    x = pp.remove_special_chars(x)\n    \n    # Return the cleaned string\n    return x\n\n\n#here is the cleaned text in one shot\ndf['your_cleaned_column'] = df['your_text_column'].apply(lambda x: get_clean(x))\n\n```\n\n\n\n\n\n\nversion: 0.0.1\n",
    "bugtrack_url": null,
    "license": "",
    "summary": "This is for text preprocessing",
    "version": "0.0.1",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b1038d3447f7037d88066233ca7c10197e3c0894aed9962cb42c13a800807ae4",
                "md5": "28140dde169246506a2f0558e765a671",
                "sha256": "aa726d3e375a3c712db382eb4a0faee02a633c26c5361940cc169027b2935aae"
            },
            "downloads": -1,
            "filename": "tweetben-0.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "28140dde169246506a2f0558e765a671",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 4086,
            "upload_time": "2023-04-26T15:25:17",
            "upload_time_iso_8601": "2023-04-26T15:25:17.937475Z",
            "url": "https://files.pythonhosted.org/packages/b1/03/8d3447f7037d88066233ca7c10197e3c0894aed9962cb42c13a800807ae4/tweetben-0.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a30b2d7e915e08bb9da05a34bb4c601d0acebdceef47dd0dc6ba21fd53f18b04",
                "md5": "1b4035448cf64ab097c487374fe7f270",
                "sha256": "1fa997dece0e1121d684022b87af9e24e9d328b89f2b8e38c6715c3d729b1ced"
            },
            "downloads": -1,
            "filename": "tweetben-0.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "1b4035448cf64ab097c487374fe7f270",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 3647,
            "upload_time": "2023-04-26T15:25:20",
            "upload_time_iso_8601": "2023-04-26T15:25:20.726641Z",
            "url": "https://files.pythonhosted.org/packages/a3/0b/2d7e915e08bb9da05a34bb4c601d0acebdceef47dd0dc6ba21fd53f18b04/tweetben-0.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-04-26 15:25:20",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "tweetben"
}
        