AnimatedWordCloud


* **Name:** AnimatedWordCloud
* **Version:** 1.0.9
* **Home page:** https://github.com/PetrKorab/AnimatedWordCloud
* **Summary:** Animated version of classic word cloud for time-series text data
* **Upload time:** 2024-05-05 20:07:35
* **Author:** Petr Koráb
* **Requires Python:** >=3.8, <3.9
* **License:** OSI Approved :: Apache Software License

[![pypi](https://img.shields.io/pypi/v/AnimatedWordCloud.svg)](https://pypi.python.org/pypi/AnimatedWordCloud)
[![python](https://img.shields.io/pypi/pyversions/AnimatedWordCloud.svg)](https://pypi.python.org/pypi/AnimatedWordCloud)
[![License: Apache 2.0](https://badgen.net/badge/license/apache-2-0/blue)](https://opensource.org/license/apache-2-0/)


# AnimatedWordCloud
**Animated version of classic word cloud for time-series text data**

A classic word cloud does not capture variation over time in text data. An animated word cloud addresses this by displaying text datasets collected over multiple periods in a single MP4 file.
The core framework for animating word frequencies was developed by Michael Kane in the [WordSwarm](https://github.com/thisIsMikeKane/WordSwarm) project. **AnimatedWordCloud** adapts that code
to work efficiently on text datasets in languages that use the Latin alphabet.

## Installation

AnimatedWordCloud requires Python 3.8 and the following packages: [Box2D](https://pypi.org/project/Box2D), [beautifulsoup4](https://pypi.org/project/beautifulsoup4),
[pygame](https://pypi.org/project/pygame), and [PyQt6](https://pypi.org/project/PyQt6) (visualization),
plus [Arabica](https://pypi.org/project/Arabica/) and [ftfy](https://pypi.org/project/ftfy) (text preprocessing).

To install using pip, use:

`pip install AnimatedWordCloud`


AnimatedWordCloud has been tested with **PyCharm** Community Edition. It is recommended to use this IDE and to run .py files instead of .ipynb notebooks.
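Because the package metadata restricts the interpreter to Python 3.8 (`>=3.8,<3.9`), it can help to check the running version before installing. The snippet below is a minimal sketch; the helper name `is_supported` is hypothetical and not part of AnimatedWordCloud.

``` python
import sys


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter satisfies >=3.8,<3.9."""
    # Compare only (major, minor); the patch level does not matter here.
    return (3, 8) <= tuple(version_info[:2]) < (3, 9)


if not is_supported():
    print("Warning: AnimatedWordCloud declares support for Python 3.8 only.")
```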

## Usage

* **Import the library**:

``` python
from AnimatedWordCloud import animated_word_cloud
```

* **Generate frames:**

**animated_word_cloud** generates 90 PNG word cloud images per period. It scales word frequencies so that word clouds can be displayed for text datasets of different sizes. Frames are stored in the newly created *.post_processing/frames* folder in the working directory. It currently provides unigram frequencies (bigram frequencies will be added later). It reads dates in the following date and datetime formats:

* **US-style**: *MM/DD/YYYY* (2013-12-31, Feb-09-2009, 2013-12-31 11:46:17, etc.)
* **European-style**: *DD/MM/YYYY* (2013-31-12, 09-Feb-2009, 2013-31-12 11:46:17, etc.)
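
The difference between the two styles matters for ambiguous dates. The sketch below uses only the standard library and is not the package's own parser; the `PATTERNS` mapping and the `parse_date` helper are assumptions made for illustration, keyed by the same `'us'` and `'eur'` values that `date_format` accepts.

``` python
from datetime import datetime

# Hypothetical mapping from date_format values to strptime patterns.
PATTERNS = {"us": "%m/%d/%Y", "eur": "%d/%m/%Y"}


def parse_date(value: str, date_format: str) -> datetime:
    """Parse a date string as month-first ('us') or day-first ('eur')."""
    return datetime.strptime(value, PATTERNS[date_format])


# The same string resolves to two different days depending on the style.
us = parse_date("02/09/2009", "us")    # 9 February 2009
eur = parse_date("02/09/2009", "eur")  # 2 September 2009
print(us.date(), eur.date())
```

Choosing the wrong style would silently shift such records into a different month, and therefore possibly into a different aggregation period.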


On input, it automatically removes punctuation and numbers from the data. It can also remove the standard list(s) of stop words for languages in the [NLTK](https://www.nltk.org) stop word corpus.
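
To illustrate the kind of cleaning described above, here is a rough standard-library sketch. It is not the package's implementation: `clean_text` is a hypothetical helper, and the small hand-written stop word set stands in for the NLTK lists.

``` python
import re


def clean_text(text, stop_words=frozenset(), skip=()):
    """Lowercase, drop punctuation and digits, then filter stop words."""
    lowered = text.lower()
    # Replace punctuation, digits, and underscores with spaces;
    # accented Latin letters are kept because \w is Unicode-aware.
    letters_only = re.sub(r"[^\w\s]|[\d_]", " ", lowered)
    drop = set(stop_words) | {word.lower() for word in skip}
    return [token for token in letters_only.split() if token not in drop]


tokens = clean_text(
    "The ECB raised rates by 25 basis points in 2023!",
    stop_words={"the", "by", "in"},  # stand-in for a language stop word list
    skip=["points"],                 # extra words, like the skip parameter
)
print(tokens)
```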


``` python
def animated_word_cloud(text: str,         # Text
                        time: str,         # Time
                        date_format: str,  # Date format: 'eur' - European, 'us' - American
                        ngram: int,        # N-gram order, 1 = unigram
                        freq: str,         # Aggregation period: 'Y'/'M'
                        stopwords: list,   # Languages for stop words
                        skip: list         # Additional stop words to remove
                        ): ...
```

To apply the method, use:

``` python
import pandas as pd
data = pd.read_csv("data.csv")
```


``` python
animated_word_cloud(text = data['text'],                         # Read text column
                    time = data['date'],                         # Read date column
                    date_format = 'us',                          # Specify date format
                    ngram = 1,                                   # Show individual word frequencies
                    freq = 'Y',                                  # Yearly frequency
                    stopwords = ['english', 'german', 'french'], # Clean from English, German and French stop words
                    skip = ['good', 'bad', 'yellow'])            # Remove 'good', 'bad', and 'yellow' as additional stop words
```
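
To get a feel for the data behind each period's word cloud, the sketch below counts unigrams per year and scales them by the most frequent word in that year. This is an assumption-laden illustration: the package does not document its scaling method, and `yearly_unigram_counts` and `scale` are hypothetical helpers, not part of its API.

``` python
from collections import Counter, defaultdict


def yearly_unigram_counts(records):
    """Count words per year from an iterable of (year, text) pairs."""
    counts = defaultdict(Counter)
    for year, text in records:
        counts[year].update(text.lower().split())
    return dict(counts)


def scale(counter):
    """Divide by the period's largest count so values fall in (0, 1]."""
    top = max(counter.values())
    return {word: count / top for word, count in counter.items()}


records = [
    (2020, "inflation growth inflation"),
    (2021, "growth growth rates"),
]
counts = yearly_unigram_counts(records)
scaled_2020 = scale(counts[2020])
print(scaled_2020)
```

Scaling within each period is one simple way to keep word sizes comparable when some periods contain far more text than others.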


* **Create video from frames:**

Download the *ffmpeg* folder and the *frames2video.bat* file from [here](https://github.com/thisIsMikeKane/WordSwarm/tree/master/3-Postprocessing) and place them in the *postprocessing* folder. Then run *frames2video.bat* to generate *wordSwarmOut.mp4*, the final video.
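
On systems where a batch file cannot be run, an equivalent ffmpeg call could be assembled by hand. The sketch below only builds the command and estimates the clip length; it does not reproduce *frames2video.bat*. The frame file pattern `%05d.png` and the 30 fps frame rate are assumptions, since the README does not state how frames are named or how fast they are played, so adjust both to match the generated files.

``` python
from pathlib import Path

FRAMES_PER_PERIOD = 90  # stated in the README


def build_ffmpeg_command(frames_dir, pattern="%05d.png", fps=30,
                         output="wordSwarmOut.mp4"):
    """Build an ffmpeg argument list (pattern and fps are assumptions)."""
    return [
        "ffmpeg",
        "-framerate", str(fps),               # input frame rate
        "-i", str(Path(frames_dir) / pattern),  # numbered image sequence
        "-c:v", "libx264",                    # H.264 video codec
        "-pix_fmt", "yuv420p",                # widely compatible pixel format
        output,
    ]


def clip_seconds(n_periods, fps=30):
    """Estimated video length for a given number of periods."""
    return n_periods * FRAMES_PER_PERIOD / fps


# Folder name as written in the README's frame-generation step.
cmd = build_ffmpeg_command(".post_processing/frames")
print(" ".join(cmd))
print(clip_seconds(10))  # ten yearly periods at the assumed 30 fps
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once ffmpeg is installed and the pattern matches the actual frame file names.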

[![AnimatedWordCloud](https://github.com/PetrKorab/AnimatedWordCloud/raw/main/screenshot_awc.png)](https://github.com/PetrKorab/AnimatedWordCloud)


## Documentation, examples and tutorials

> [Data Storytelling with Animated Word Clouds](https://towardsdatascience.com/data-storytelling-with-animated-word-clouds-1889fdeb97b8) 

* For more coding examples, read these tutorials: TBA

Here are examples of animated word clouds:

> Research trends in Economics [YouTube](https://www.youtube.com/watch?v=-2gH7Xfn0AI&t=10s)

> European Central Bankers' speeches [YouTube](https://www.youtube.com/watch?v=oOgEpGtsJaI)

---

For questions, issues, bug reports, and suggestions, please visit the [issue tracker](https://github.com/PetrKorab/AnimatedWordCloud/issues).

            
