chat-miner


| Field | Value |
|---|---|
| Name | chat-miner |
| Version | 0.5.4 |
| Home page | None |
| Summary | Lean parsers and visualizations for chat data. |
| Upload time | 2024-11-01 23:28:22 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | >=3.8 |
| License | None |
| Keywords | chat, chatdata, messenger, parser, wordcloud |
| VCS | |
| Bug tracker URL | |
| Requirements | No requirements were recorded. |
| Travis CI | No Travis. |
| Coveralls test coverage | No coveralls. |

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="doc/_static/logo-wide-dark.png">
  <source media="(prefers-color-scheme: light)" srcset="doc/_static/logo-wide-light.png">
  <img alt="chat-miner: turn your chats into artwork" src="doc/_static/logo-wide-light.png">
</picture>

-----------------

# chat-miner: turn your chats into artwork

[![PyPI Version](https://img.shields.io/pypi/v/chat-miner.svg)](https://pypi.org/project/chat-miner/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://static.pepy.tech/badge/chat-miner/month)](https://pepy.tech/project/chat-miner)
[![codecov](https://codecov.io/gh/joweich/chat-miner/branch/main/graph/badge.svg?token=6EQF0YNGLK)](https://codecov.io/gh/joweich/chat-miner)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

-----------------

**chat-miner** provides lean parsers that transform chat exports from every major messaging platform into dataframes. Artistic visualizations let you explore your data and create artwork from your chats.


## 1. Installation
The latest release, including its dependencies, can be installed from PyPI:
```sh
pip install chat-miner
```

If you're interested in contributing, want to run the latest source code, or just like to build everything yourself, install from source:
```sh
git clone https://github.com/joweich/chat-miner.git
cd chat-miner
pip install .
```
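
To confirm which release ended up in the active environment, the standard library can query the installed distribution metadata. The following is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is available); the helper name `installed_version` is illustrative and not part of chat-miner.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "chat-miner") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if chat-miner is installed, otherwise None.
print(installed_version())
```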

## 2. Exporting chat logs
Have a look at the export guides for [WhatsApp](https://faq.whatsapp.com/1180414079177245/), [Signal](https://github.com/carderne/signal-export), [Telegram](https://telegram.org/blog/export-and-more), [Facebook Messenger](https://www.facebook.com/help/messenger-app/713635396288741), or [Instagram Chats](https://help.instagram.com/181231772500920) to learn how to export chat logs for your platform.

## 3. Parsing
The following code showcases the ``WhatsAppParser`` class.
The usage of ``SignalParser``, ``TelegramJsonParser``, ``FacebookMessengerParser``, and ``InstagramJsonParser`` follows the same pattern.
```python
from chatminer.chatparsers import WhatsAppParser

parser = WhatsAppParser(FILEPATH)
parser.parse_file()
df = parser.parsed_messages.get_df(as_pandas=True) # as_pandas=False returns polars dataframe
```
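
Because all parsers follow the same pattern, the choice of parser can be driven by a platform name. The sketch below uses the class names listed above; that they are all importable from `chatminer.chatparsers` is an assumption, and the names `PARSER_CLASS_NAMES` and `load_chat` are illustrative helpers, not part of chat-miner.

```python
import importlib

# Class names as listed in this README; that they all live in
# chatminer.chatparsers is an assumption of this sketch.
PARSER_CLASS_NAMES = {
    "whatsapp": "WhatsAppParser",
    "signal": "SignalParser",
    "telegram": "TelegramJsonParser",
    "facebook": "FacebookMessengerParser",
    "instagram": "InstagramJsonParser",
}


def load_chat(platform: str, filepath: str, as_pandas: bool = True):
    """Parse an export with the parser matching `platform` and return a dataframe."""
    try:
        class_name = PARSER_CLASS_NAMES[platform.lower()]
    except KeyError:
        raise ValueError(f"Unsupported platform: {platform!r}") from None
    # Imported lazily so the name lookup above works without chat-miner installed.
    module = importlib.import_module("chatminer.chatparsers")
    parser = getattr(module, class_name)(filepath)
    parser.parse_file()
    return parser.parsed_messages.get_df(as_pandas=as_pandas)
```

With chat-miner installed, `load_chat("whatsapp", FILEPATH)` would then mirror the three-line example above.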
**Note:**
On Windows, backslashes in a regular string literal are interpreted as escape sequences, so write the filepath as a raw string.
```python
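import os

# Keep only the assignment that matches your system.
# FILEPATH = r"C:\Users\Username\chat.txt"  # Windows: raw string keeps backslashes literal
FILEPATH = "/home/username/chat.txt"  # Unix
assert os.path.isfile(FILEPATH)  # fails early if the export is not where you expect it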
```
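
The raw-string note can be checked with the standard library alone. Without the `r` prefix, `"C:\Users\..."` is a syntax error because `\U` starts a Unicode escape; doubling the backslashes is the equivalent spelling. The helper `require_file` below is a hypothetical convenience that replaces the bare `assert` with a descriptive error.

```python
from pathlib import Path, PureWindowsPath

# A raw string and an explicitly escaped string describe the same path.
windows_raw = r"C:\Users\Username\chat.txt"
windows_escaped = "C:\\Users\\Username\\chat.txt"
assert windows_raw == windows_escaped

# PureWindowsPath parses Windows-style paths on any operating system.
print(PureWindowsPath(windows_raw).name)  # chat.txt


def require_file(filepath: str) -> Path:
    """Return `filepath` as a Path, or raise a descriptive error if no file exists there."""
    path = Path(filepath).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No chat export found at {path}")
    return path
```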

## 4. Visualizing
```python
import chatminer.visualizations as vis
import matplotlib.pyplot as plt
```
### 4.1 Heatmap: Message count per day
```python
fig, ax = plt.subplots(2, 1, figsize=(9, 3))
ax[0] = vis.calendar_heatmap(df, year=2020, cmap='Oranges', ax=ax[0])
ax[1] = vis.calendar_heatmap(df, year=2021, linewidth=0, monthly_border=True, ax=ax[1])
```

<p align="center">
  <img src="examples/heatmap.svg">
</p>
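
The quantity behind the heatmap is a message count per calendar day within one year. The following standard-library sketch illustrates that aggregation on hypothetical timestamps; it is not chat-miner's implementation, and `messages_per_day` is an illustrative name.

```python
from collections import Counter
from datetime import date, datetime

# Hypothetical timestamps standing in for the parsed messages.
timestamps = [
    datetime(2020, 1, 1, 9, 15),
    datetime(2020, 1, 1, 21, 40),
    datetime(2020, 1, 3, 12, 5),
    datetime(2021, 6, 30, 8, 0),
]


def messages_per_day(stamps, year):
    """Count messages per calendar day, restricted to a single year."""
    return Counter(ts.date() for ts in stamps if ts.year == year)


counts_2020 = messages_per_day(timestamps, 2020)
print(counts_2020[date(2020, 1, 1)])  # 2
print(counts_2020[date(2020, 1, 2)])  # 0, days without messages count as zero
```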

### 4.2 Sunburst: Message count per time of day
```python
fig, ax = plt.subplots(1, 2, figsize=(7, 3), subplot_kw={'projection': 'polar'})
ax[0] = vis.sunburst(df, highlight_max=True, isolines=[2500, 5000], isolines_relative=False, ax=ax[0])
ax[1] = vis.sunburst(df, highlight_max=False, isolines=[0.5, 1], color='C1', ax=ax[1])
```

<p align="center">
  <img src="examples/sunburst.svg">
</p>
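
The two calls above suggest that isolines can be given either as absolute message counts (`isolines_relative=False`) or as fractions of the busiest hour. That reading is an assumption, and the sketch below only illustrates it with the standard library on hypothetical timestamps; `messages_per_hour` and `absolute_isolines` are illustrative names, not chat-miner functions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps: four messages at 08:xx, two at 20:xx.
timestamps = [datetime(2021, 5, day, 8, 30) for day in range(1, 5)]
timestamps += [datetime(2021, 5, day, 20, 0) for day in range(1, 3)]


def messages_per_hour(stamps):
    """Return 24 counts, one per hour of the day."""
    counts = Counter(ts.hour for ts in stamps)
    return [counts[hour] for hour in range(24)]


def absolute_isolines(isolines, radii, relative=True):
    """Scale relative isolines by the largest radius; pass absolute ones through."""
    if not relative:
        return list(isolines)
    peak = max(radii)
    return [level * peak for level in isolines]


radii = messages_per_hour(timestamps)
busiest_hour = radii.index(max(radii))  # the bar a highlight would single out
levels = absolute_isolines([0.5, 1], radii)
```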

### 4.3 Wordcloud: Word frequencies
```python
fig, ax = plt.subplots(figsize=(8, 3))
stopwords = ['these', 'are', 'stopwords']
kwargs = {"background_color": "white", "width": 800, "height": 300, "max_words": 500}
ax = vis.wordcloud(df, ax=ax, stopwords=stopwords, **kwargs)
```
<p align="center">
  <img src="examples/wordcloud.svg">
</p>
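
A word cloud is driven by word frequencies with stopwords filtered out. The sketch below shows that counting step with the standard library on hypothetical messages; the tokenization rule (lowercased `\w+` runs) is an assumption and may differ from what chat-miner and the underlying word cloud library do.

```python
import re
from collections import Counter


def word_frequencies(messages, stopwords=(), max_words=500):
    """Return the most frequent words across all messages, excluding stopwords."""
    blocked = {word.lower() for word in stopwords}
    words = (
        word
        for message in messages
        for word in re.findall(r"\w+", message.lower())
        if word not in blocked
    )
    return Counter(words).most_common(max_words)


# Hypothetical messages for illustration.
messages = ["These are stopwords", "Pizza tonight?", "pizza again, pizza forever"]
frequencies = word_frequencies(messages, stopwords=["these", "are", "stopwords"])
print(frequencies[0])  # ('pizza', 3)
```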

### 4.4 Radarchart: Message count per weekday
```python
if not vis.is_radar_registered():
    vis.radar_factory(7, frame="polygon")
fig, ax = plt.subplots(1, 2, figsize=(7, 3), subplot_kw={'projection': 'radar'})
ax[0] = vis.radar(df, ax=ax[0])
ax[1] = vis.radar(df, ax=ax[1], color='C1', alpha=0)
```
<p align="center">
  <img src="examples/radar.svg">
</p>
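
A radar chart with seven axes places one weekday on each spoke, with the spokes evenly spaced around the circle and the message count as the radius. The following standard-library sketch computes those (label, angle, radius) triples from hypothetical timestamps; it illustrates the geometry only and is not chat-miner's implementation.

```python
import math
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical timestamps: 2024-01-01 and 2024-01-08 are Mondays, 2024-01-07 is a Sunday.
timestamps = [datetime(2024, 1, 1), datetime(2024, 1, 8), datetime(2024, 1, 7)]


def radar_points(stamps):
    """Return (weekday label, angle in radians, message count) for each of the 7 spokes."""
    counts = Counter(ts.weekday() for ts in stamps)  # Monday is 0
    step = 2 * math.pi / len(WEEKDAYS)
    return [(WEEKDAYS[i], i * step, counts[i]) for i in range(len(WEEKDAYS))]


points = radar_points(timestamps)
print(points[0])  # ('Mon', 0.0, 2)
```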

## 5. Natural Language Processing

### 5.1 Add Sentiment

```python
from chatminer.nlp import add_sentiment

df_sentiment = add_sentiment(df)
```
### 5.2 Example Plot: Sentiment per Author in Groupchat

```python
df_grouped = df_sentiment.groupby(['author', 'sentiment']).size().unstack(fill_value=0)
ax = df_grouped.plot(kind='bar', stacked=True, figsize=(8, 3))
```

<p align="center">
  <img src="examples/nlp.svg">
</p>
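
The `groupby(...).size().unstack(fill_value=0)` chain builds a table with one row per author, one column per sentiment label, and zeros where an author has no messages with that label. The sketch below reproduces that shape with the standard library; the authors and the labels `positive`, `neutral`, and `negative` are hypothetical and may not match what `add_sentiment` produces.

```python
from collections import Counter, defaultdict

# Hypothetical (author, sentiment) pairs standing in for df_sentiment.
rows = [
    ("Alice", "positive"),
    ("Alice", "positive"),
    ("Alice", "negative"),
    ("Bob", "neutral"),
]


def sentiment_table(pairs):
    """Count messages per author and sentiment, filling missing combinations with 0."""
    per_author = defaultdict(Counter)
    labels = set()
    for author, sentiment in pairs:
        per_author[author][sentiment] += 1
        labels.add(sentiment)
    ordered = sorted(labels)
    return {
        author: {label: counts[label] for label in ordered}
        for author, counts in sorted(per_author.items())
    }


table = sentiment_table(rows)
print(table["Bob"])  # {'negative': 0, 'neutral': 1, 'positive': 0}
```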


## 6. Command Line Interface
The CLI supports parsing chat logs into CSV files.
As of now, you **can't** create visualizations from the CLI directly.

Example usage:
```bash
$ chatminer -p whatsapp -i exportfile.txt -o output.csv
```

Usage guide:
```
usage: chatminer [-h] [-p {whatsapp,instagram,facebook,signal,telegram}] [-i INPUT] [-o OUTPUT]

options:
  -h, --help 
                        Show this help message and exit
  -p {whatsapp,instagram,facebook,signal,telegram}, --parser {whatsapp,instagram,facebook,signal,telegram}
                        The platform from which the chats are imported
  -i INPUT, --input INPUT
                        Input file to be processed
  -o OUTPUT, --output OUTPUT
                        Output file for the results
```
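
When several exports need converting, the command line can be assembled in Python from the flags documented above. This is a minimal sketch: `build_command` is a hypothetical helper, and defaulting the output name to the input name with a `.csv` suffix is a choice made here, not behaviour of the CLI.

```python
import shlex
from pathlib import Path

# Choices accepted by -p/--parser according to the usage guide above.
PLATFORMS = ("whatsapp", "instagram", "facebook", "signal", "telegram")


def build_command(platform, input_file, output_file=None):
    """Build the chatminer argument list for one export file."""
    if platform not in PLATFORMS:
        raise ValueError(f"Unsupported platform: {platform!r}")
    source = Path(input_file)
    target = Path(output_file) if output_file else source.with_suffix(".csv")
    return ["chatminer", "-p", platform, "-i", str(source), "-o", str(target)]


command = build_command("whatsapp", "exportfile.txt")
print(shlex.join(command))  # chatminer -p whatsapp -i exportfile.txt -o exportfile.csv
```

With chat-miner installed, the list can be passed to `subprocess.run(command, check=True)` to execute the conversion.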

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "chat-miner",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Jonas Weich <jns.wch@gmail.com>",
    "keywords": "chat, chatdata, messenger, parser, wordcloud",
    "author": null,
    "author_email": "Jonas Weich <jns.wch@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/47/ff/5ce0117919d65b03dd7879db7819eebd05a45f88e134d40637445ea05b8b/chat_miner-0.5.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Lean parsers and visualizations for chat data.",
    "version": "0.5.4",
    "project_urls": {
        "Bug Tracker": "https://github.com/joweich/chat-miner/issues",
        "Source Code": "https://github.com/joweich/chat-miner"
    },
    "split_keywords": [
        "chat",
        " chatdata",
        " messenger",
        " parser",
        " wordcloud"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1e276b366d660fd905642c3c7412ade4aa2eb20ca0c9d5826acf49ca9ab9e321",
                "md5": "4b64fef7f43333e24a7bd6b76201be56",
                "sha256": "8899fff7a8059ad3d5dd8841bd812ae4f056488c3168556d5a6d986cdb3e8a05"
            },
            "downloads": -1,
            "filename": "chat_miner-0.5.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "4b64fef7f43333e24a7bd6b76201be56",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 13636,
            "upload_time": "2024-11-01T23:28:21",
            "upload_time_iso_8601": "2024-11-01T23:28:21.605120Z",
            "url": "https://files.pythonhosted.org/packages/1e/27/6b366d660fd905642c3c7412ade4aa2eb20ca0c9d5826acf49ca9ab9e321/chat_miner-0.5.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "47ff5ce0117919d65b03dd7879db7819eebd05a45f88e134d40637445ea05b8b",
                "md5": "3f948d1cf6ff3336d3a075318d9b82e9",
                "sha256": "116510ce7f1166fba78698f637d69a305b87dbf0e687a05b58dffb83396341ca"
            },
            "downloads": -1,
            "filename": "chat_miner-0.5.4.tar.gz",
            "has_sig": false,
            "md5_digest": "3f948d1cf6ff3336d3a075318d9b82e9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 13014,
            "upload_time": "2024-11-01T23:28:22",
            "upload_time_iso_8601": "2024-11-01T23:28:22.686403Z",
            "url": "https://files.pythonhosted.org/packages/47/ff/5ce0117919d65b03dd7879db7819eebd05a45f88e134d40637445ea05b8b/chat_miner-0.5.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-01 23:28:22",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "joweich",
    "github_project": "chat-miner",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "chat-miner"
}