ChordReviewsVis


- **Name:** ChordReviewsVis
- **Version:** 0.3.3
- **Home page:** https://github.com/felix-funes/ChordReviewsVis
- **Summary:** Process reviews data, apply text preprocessing, and generate a chord plot visualization showing word co-occurrence patterns and sentiment analysis.
- **Upload time:** 2024-07-14 19:18:02
- **Keywords:** customer reviews, sentiment analysis, chord plot

# Chord Reviews

## Overview
`ChordReviewsVis` is a Python package designed to process and visualize review data by generating chord plots. These visualizations illustrate word co-occurrence patterns and sentiment analysis, providing insights into the textual data. For this, the visualization relies on the following features:
- **Labels for each node:** The top nouns and adjectives extracted from the reviews are displayed around the graphic.
- **Label bars:** Below the labels, there is a bar whose color illustrates the words' overall frequency in reviews. The darker the color, the more frequent the word.
- **Edges:** Lines connecting words that occur together.
- **Edge thickness:** This characteristic shows how often the connected words appear in the same sentence. The more often they are together, the thicker the line is.
- **Edge color:** The color shows the overall sentiment of the words that are being connected. Red is used for negative sentiments, blue for neutral ones, and green for positive sentiments.
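
The edge features above can be illustrated with a minimal, standard-library sketch. It is not the package's implementation: the function names `cooccurrence_counts` and `sentiment_color`, the naive sentence splitting, and the `0.05` threshold are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(reviews, vocabulary):
    """Count how often each pair of vocabulary words shares a sentence."""
    vocab = set(vocabulary)
    counts = Counter()
    for review in reviews:
        # Naive sentence split on ., ! and ? (an assumption for this sketch).
        for sentence in re.split(r"[.!?]+", review.lower()):
            words = set(re.findall(r"[a-z']+", sentence)) & vocab
            # Sort so that each unordered pair maps to a single key.
            for pair in combinations(sorted(words), 2):
                counts[pair] += 1
    return counts


def sentiment_color(score, threshold=0.05):
    """Map a sentiment score in [-1, 1] to an edge color (assumed threshold)."""
    if score > threshold:
        return "green"
    if score < -threshold:
        return "red"
    return "blue"


reviews = [
    "Great acting and a great story. The ending was weak.",
    "The story was weak, but the acting was great!",
]
edges = cooccurrence_counts(reviews, ["acting", "story", "great", "weak", "ending"])
for (first, second), weight in edges.most_common(3):
    print(first, second, weight)
```

In this sketch, a higher pair count would correspond to a thicker edge, and the color would come from whatever sentiment score is attached to the sentences in which the pair appears.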

This package was developed by Felix Jose Funes as part of his master's dissertation at NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, Portugal, which was supervised by Prof. Nuno Antonio, PhD.

## Installation
To install `ChordReviewsVis`, use pip:
```
pip install ChordReviewsVis
```

## Usage
First, import the necessary libraries and the `ChordReviews` function:
```
import pandas as pd
from ChordReviewsVis import ChordReviews
```

Prepare the DataFrame with a text column containing review data. Then call the `ChordReviews` function:
```
# Load DataFrame
df = pd.read_csv("filepath")

# Generate chord plot
ChordReviews(df, 'review')
```
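
As a rough illustration of the input this call expects, the sketch below writes and re-reads a tiny CSV with a `review` column using only the standard library. The rows and the extra `rating` column are hypothetical; the only point is that the file needs a header whose name matches the column passed as the second argument.

```python
import csv
import io

# Hypothetical rows; "review" is the text column named in the call above.
rows = [
    {"review": "Great acting and a moving story.", "rating": "5"},
    {"review": "The plot was slow and predictable.", "rating": "2"},
]

buffer = io.StringIO()  # stands in for a CSV file on disk
writer = csv.DictWriter(buffer, fieldnames=["review", "rating"])
writer.writeheader()
writer.writerows(rows)

# Reading the data back shows the header row that becomes the column names.
buffer.seek(0)
reader = csv.DictReader(buffer)
print(reader.fieldnames)
texts = [row["review"] for row in reader]
print(len(texts), "reviews loaded")
```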

Some datasets that can be used for this purpose are:

* [IMDB Movie Reviews](https://www.kaggle.com/datasets/atulanandjha/imdb-50k-movie-reviews-test-your-bert)
* [Women's E-Commerce Clothing Reviews](https://www.kaggle.com/datasets/nicapotato/womens-ecommerce-clothing-reviews)
* [Amazon Fine Food Reviews](https://www.kaggle.com/datasets/snap/amazon-fine-food-reviews)

## Function Parameters
- **df** (pandas.DataFrame): DataFrame containing review data.
- **text_column** (str): Name of the column containing the text data.
- **size** (int, optional): Size of the output chord plot. Default is 300.
- **stopwords_to_add** (list, optional): Additional stopwords to include in the stop words set. Default is an empty list.
- **stemming** (bool, optional): Whether to apply stemming to words. Default is False.
- **lemmatization** (bool, optional): Whether to apply lemmatization to words. Default is True.
- **words_to_replace** (dict, optional): A dictionary where keys are words to be replaced and values are the replacements. Default is an empty dictionary.
- **label_text_font_size** (int, optional): Font size for the labels in the chord plot. Default is 12.

## Returns
- **hv.Chord**: A chord plot visualization of word co-occurrence patterns and sentiment analysis.

## Examples 
### Basic Usage
```
# Import necessary libraries
import pandas as pd
from ChordReviewsVis import ChordReviews

# Load dataset
df = pd.read_csv("https://github.com/felix-funes/ChordReviewsVis/raw/main/Test%20Dataset%20-%20IMDB%20Movie%20Reviews.csv")

# Generate chord plot
ChordReviews(df, 'review')

```

![Chord plot example](https://raw.githubusercontent.com/felix-funes/ChordReviewsVis/6984c3720d6c3b2902a6ff70374040fe4d25f97b/Sample%20Chord%20Plot%20-%20IMDB%20Dataset%20-%20Basic%20usage.svg)

### Custom Parameters

Lemmatization is used by default, but stemming can be used instead.
```
# Import necessary libraries
import pandas as pd
from ChordReviewsVis import ChordReviews

# Load dataset
df = pd.read_csv("https://github.com/felix-funes/ChordReviewsVis/raw/main/Test%20Dataset%20-%20IMDB%20Movie%20Reviews.csv")

# Generate chord plot
ChordReviews(df, 'review', stemming=True, lemmatization=False)

```
![Chord plot example with stemming](https://raw.githubusercontent.com/felix-funes/ChordReviewsVis/7b50f84045ddb126ba2a6fe5d036e86b23325625/Sample%20Chord%20Plot%20-%20IMDB%20Dataset%20-%20Stemming.svg)

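To refine the visualization, use the `stopwords_to_add` parameter to remove irrelevant words and `words_to_replace` to unify terms with the same meaning. In the example below, the stop words "wa" and "ha" are presumably what remains of "was" and "has" after lemmatization.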

```
# Import necessary libraries
import pandas as pd
from ChordReviewsVis import ChordReviews

# Load dataset
df = pd.read_csv("https://github.com/felix-funes/ChordReviewsVis/raw/main/Test%20Dataset%20-%20IMDB%20Movie%20Reviews.csv")

# Generate chord plot
ChordReviews(df, 'review', stemming=False, lemmatization=True, stopwords_to_add=["wa", "ha"], words_to_replace={"movie": "film"})
```
![Chord plot using the words_to_replace parameter](https://raw.githubusercontent.com/felix-funes/ChordReviewsVis/8335a92c77d0420a9a1eee8db509eae5cdde7af3/Sample%20Chord%20Plot%20-%20IMDB%20Dataset%20-%20Replacing%20words.svg)

Because the words "film" and "movie" are so prevalent in this dataset, they may be considered stop words and removed with the `stopwords_to_add` parameter. For presentation purposes, the final plot and label text can be resized with `size` and `label_text_font_size`.
```
# Import necessary libraries
import pandas as pd
from ChordReviewsVis import ChordReviews

# Load dataset
df = pd.read_csv("https://github.com/felix-funes/ChordReviewsVis/raw/main/Test%20Dataset%20-%20IMDB%20Movie%20Reviews.csv")

# Generate chord plot
ChordReviews(df, 'review', stemming=False, lemmatization=True, stopwords_to_add=["wa", "ha", "movie", "film"], label_text_font_size=13, size=400)

```
![Large chord plot with stop words](https://raw.githubusercontent.com/felix-funes/ChordReviewsVis/8335a92c77d0420a9a1eee8db509eae5cdde7af3/Sample%20Chord%20Plot%20-%20IMDB%20Dataset%20-%20Stop%20words%20and%20larger%20size.svg)

## Dependencies
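Ensure you have the following libraries installed:
- pandas
- numpy
- nltk
- BeautifulSoup (distributed on PyPI as `beautifulsoup4`)
- holoviews

The package also uses `re`, which is part of the Python standard library and does not need to be installed.

The third-party libraries can be installed via pip:
```
pip install pandas numpy nltk beautifulsoup4 holoviews
```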
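
A quick way to confirm that the dependencies are available is to look up their import names, as in the sketch below. The helper `missing_modules` is hypothetical, and the list assumes the usual import names (`bs4` for `beautifulsoup4`).

```python
from importlib.util import find_spec

# Import names, which can differ from the pip package names
# (beautifulsoup4 is imported as bs4).
REQUIRED_MODULES = ["pandas", "numpy", "nltk", "bs4", "holoviews"]


def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```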

## Contact
For any issues or inquiries, please contact the package maintainer via [LinkedIn](https://www.linkedin.com/in/felix-funes/).

## License
```
MIT License

Copyright (c) 2024 Felix Funes

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

            
