[![pypi](https://img.shields.io/pypi/v/AnimatedWordCloud.svg)](https://pypi.python.org/pypi/AnimatedWordCloud)
[![python](https://img.shields.io/pypi/pyversions/AnimatedWordCloud.svg)](https://pypi.python.org/pypi/AnimatedWordCloud)
[![License: Apache 2.0](https://badgen.net/badge/license/apache-2-0/blue)](https://opensource.org/license/apache-2-0/)
# AnimatedWordCloud
**Animated version of the classic word cloud for time-series text data**

A classic word cloud does not capture how text data varies over time. An animated word cloud addresses this by displaying text datasets collected over multiple periods in a single MP4 file.
The core framework for animating word frequencies was developed by Michael Kane in the [WordSwarm](https://github.com/thisIsMikeKane/WordSwarm) project. **AnimatedWordCloud** adapts that code to work efficiently on a variety of text datasets in Latin-alphabet languages.
## Installation
It requires Python 3.8 and the following packages: [Box2D](https://pypi.org/project/Box2D), [beautifulsoup4](https://pypi.org/project/beautifulsoup4),
[pygame](https://pypi.org/project/pygame), and [PyQt6](https://pypi.org/project/PyQt6) for visualization, and
[Arabica](https://pypi.org/project/Arabica/) and [ftfy](https://pypi.org/project/ftfy) for text preprocessing.
To install using pip, use:

`pip install AnimatedWordCloud`

AnimatedWordCloud has been tested with **PyCharm** Community Edition. It is recommended to use this IDE and to run the code from .py files rather than .ipynb notebooks.
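
Since the package targets Python 3.8 only, it can help to check the interpreter before installing. The snippet below is a minimal sketch; the helper name `is_supported` is hypothetical and not part of AnimatedWordCloud.

``` python
import sys


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is a Python 3.8.x release."""
    return (version_info[0], version_info[1]) == (3, 8)


if not is_supported():
    # Only a warning: pip itself will refuse to install on other versions.
    print("AnimatedWordCloud expects Python 3.8; found", sys.version.split()[0])
```
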
## Usage
* **Import the library**:
``` python
from AnimatedWordCloud import animated_word_cloud
```
* **Generate frames:**
**animated_word_cloud** generates 90 PNG word cloud images per period. It scales word frequencies so that word clouds can be displayed for text datasets of different sizes. Frames are stored in the working directory, in the newly created *.post_processing/frames* folder. It currently provides unigram frequencies (bigram frequencies will be added later). It reads dates in the following date and datetime formats:

* **US-style**: *MM/DD/YYYY* (2013-12-31, Feb-09-2009, 2013-12-31 11:46:17, etc.)
* **European-style**: *DD/MM/YYYY* (2013-31-12, 09-Feb-2009, 2013-31-12 11:46:17, etc.)
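
The `date_format` choice matters because the same string can denote two different days. The sketch below illustrates the ambiguity with the standard library only; the `FORMATS` mapping and `parse_date` helper are assumptions made for illustration and do not reproduce the library's internal parsing.

``` python
from datetime import datetime

# Assumed format strings for illustration; the library's own parsing may differ.
FORMATS = {"us": "%m/%d/%Y", "eur": "%d/%m/%Y"}


def parse_date(value: str, date_format: str) -> datetime:
    """Parse a date string as US-style or European-style."""
    return datetime.strptime(value, FORMATS[date_format])


print(parse_date("02/09/2009", "us").date())   # 2009-02-09 (February 9)
print(parse_date("02/09/2009", "eur").date())  # 2009-09-02 (September 2)
```

Choosing the wrong style silently shifts records into the wrong month, and therefore into the wrong frames when aggregating monthly.
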
The function automatically removes punctuation and numbers from the input text. It can also remove the standard stop word lists for the languages available in the [NLTK](https://www.nltk.org) stop words corpus.
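
As a rough picture of what this cleaning step does, here is a minimal sketch. It is not the library's implementation: the `clean_text` helper is hypothetical, and the small hand-written stop word set stands in for an NLTK list.

``` python
import re


def clean_text(text: str, stop_words: set) -> list:
    """Lowercase, drop punctuation and digits, then filter stop words."""
    # Replace punctuation, digits, and underscores with spaces.
    # \w is Unicode-aware, so accented Latin letters are kept.
    text = re.sub(r"[^\w\s]|[\d_]", " ", text.lower())
    return [word for word in text.split() if word not in stop_words]


# Hand-written stand-in for a stop word list (assumption for illustration).
stop_words = {"the", "by", "in"}
print(clean_text("The ECB raised rates by 0.25% in 2023!", stop_words))
# ['ecb', 'raised', 'rates']
```
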
``` python
def animated_word_cloud(text: str,         # Text
                        time: str,         # Time
                        date_format: str,  # Date format: 'eur' - European, 'us' - American
                        ngram: int,        # N-gram order, 1 = unigram
                        freq: str,         # Aggregation period: 'Y'/'M'
                        stopwords: [],     # Languages for stop words
                        skip: []           # Remove additional stop words
)
```
To apply the method, use:
``` python
import pandas as pd
data = pd.read_csv("data.csv")
```
``` python
animated_word_cloud(text=data['text'],                          # Read text column
                    time=data['date'],                          # Read date column
                    date_format='us',                           # Specify date format
                    ngram=1,                                    # Show individual word frequencies
                    freq='Y',                                   # Yearly frequency
                    stopwords=['english', 'german', 'french'],  # Clean from English, German and French stop words
                    skip=['good', 'bad', 'yellow'])             # Remove 'good', 'bad', and 'yellow' as additional stop words
```
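
To make the yearly aggregation (`freq='Y'`, `ngram=1`) more concrete, the sketch below counts unigrams per year and converts them to shares of each year's total. This is only an illustration under stated assumptions: the helper names are hypothetical, and using relative frequencies is one plausible way to compare periods of different sizes, not necessarily the scaling the library applies.

``` python
from collections import Counter, defaultdict


def yearly_unigram_counts(records, stop_words=frozenset()):
    """Count words per year from an iterable of (year, text) pairs."""
    counts = defaultdict(Counter)
    for year, text in records:
        words = [w for w in text.lower().split() if w not in stop_words]
        counts[year].update(words)
    return counts


def relative_frequencies(counter):
    """Scale raw counts to shares of the period total."""
    total = sum(counter.values())
    return {word: n / total for word, n in counter.items()} if total else {}


records = [
    (2020, "inflation growth inflation"),
    (2021, "growth growth rates"),
]
counts = yearly_unigram_counts(records)
for year in sorted(counts):
    print(year, relative_frequencies(counts[year]))
```
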
* **Create video from frames:**
Download the *ffmpeg* folder and the *frames2video.bat* file from [here](https://github.com/thisIsMikeKane/WordSwarm/tree/master/3-Postprocessing) and place them in the *postprocessing* folder. Then run *frames2video.bat*; it generates the final output, a *wordSwarmOut.mp4* file.
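
The *frames2video.bat* script is a Windows batch file. On other systems, a locally installed `ffmpeg` can assemble the frames directly. The sketch below only builds the command; the frame file name pattern, the frame rate, and the frames directory are assumptions that must be adjusted to the files actually produced, and `libx264` must be available in your ffmpeg build.

``` python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frames_dir, pattern="%05d.png", fps=30,
                         output="wordSwarmOut.mp4"):
    """Build an ffmpeg call that turns numbered PNG frames into an MP4.

    `pattern` and `fps` are assumed values, not taken from the library.
    """
    return [
        "ffmpeg",
        "-framerate", str(fps),                 # input frame rate
        "-i", str(Path(frames_dir) / pattern),  # numbered image sequence
        "-c:v", "libx264",                      # H.264 video codec
        "-pix_fmt", "yuv420p",                  # widely playable pixel format
        output,
    ]


def render(cmd):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(cmd, check=True)


# "frames" is a placeholder directory; point it at your frames folder.
cmd = build_ffmpeg_command("frames")
print(" ".join(cmd))
# render(cmd)  # uncomment once the pattern matches your frame file names
```
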
[![AnimatedWordCloud](https://github.com/PetrKorab/AnimatedWordCloud/raw/main/screenshot_awc.png)](https://github.com/PetrKorab/AnimatedWordCloud)
## Documentation, examples and tutorials
> [Data Storytelling with Animated Word Clouds](https://towardsdatascience.com/data-storytelling-with-animated-word-clouds-1889fdeb97b8)
* For more examples of coding, read these tutorials: TBA
Here are examples of animated word clouds:
> Research trends in Economics [YouTube](https://www.youtube.com/watch?v=-2gH7Xfn0AI&t=10s)

> European Central Bankers' speeches [YouTube](https://www.youtube.com/watch?v=oOgEpGtsJaI)
---
For questions, issues, bugs, and suggestions, please visit the [issue tracker](https://github.com/PetrKorab/AnimatedWordCloud/issues).
Raw data
{
"_id": null,
"home_page": "https://github.com/PetrKorab/AnimatedWordCloud",
"name": "AnimatedWordCloud",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.9,>=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Petr Kor\u00e1b",
"author_email": "xpetrkorab@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/76/45/891648204b02f0691c5558dffbcf53a14c9e43b894566b52e3f5338e1fe8/animatedwordcloud-1.0.9.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "OSI Approved :: Apache Software License",
"summary": "Animated version of classic word cloud for time-series text data",
"version": "1.0.9",
"project_urls": {
"Homepage": "https://github.com/PetrKorab/AnimatedWordCloud"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e02e818d6c0463f18e82fecbcc74347b167f2e12089073e01dcfc23644025022",
"md5": "ca6b8d0727a0a186d98ce9dcf5bdc3a4",
"sha256": "12e98c72d55ac2e0904b76ab164442063491a14baf0f84674ef3f1aba7b8c9d6"
},
"downloads": -1,
"filename": "AnimatedWordCloud-1.0.9-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ca6b8d0727a0a186d98ce9dcf5bdc3a4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.9,>=3.8",
"size": 35264592,
"upload_time": "2024-05-05T20:07:31",
"upload_time_iso_8601": "2024-05-05T20:07:31.362484Z",
"url": "https://files.pythonhosted.org/packages/e0/2e/818d6c0463f18e82fecbcc74347b167f2e12089073e01dcfc23644025022/AnimatedWordCloud-1.0.9-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7645891648204b02f0691c5558dffbcf53a14c9e43b894566b52e3f5338e1fe8",
"md5": "a3553c55f6e45bbe0edeefb672a1b22f",
"sha256": "f0d13deaa2d9773fbd94c93bc8ba39cb5996e0f534cd65f07be7484c5f1da5a7"
},
"downloads": -1,
"filename": "animatedwordcloud-1.0.9.tar.gz",
"has_sig": false,
"md5_digest": "a3553c55f6e45bbe0edeefb672a1b22f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.9,>=3.8",
"size": 35077728,
"upload_time": "2024-05-05T20:07:35",
"upload_time_iso_8601": "2024-05-05T20:07:35.813370Z",
"url": "https://files.pythonhosted.org/packages/76/45/891648204b02f0691c5558dffbcf53a14c9e43b894566b52e3f5338e1fe8/animatedwordcloud-1.0.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-05 20:07:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "PetrKorab",
"github_project": "AnimatedWordCloud",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "animatedwordcloud"
}