classgraphic


- Name: classgraphic
- Version: 0.3.1
- Home page: https://github.com/dionresearch/classgraphic
- Summary: Interactive classification diagnostic plots
- Upload time: 2023-09-21 13:01:49
- Author: Francois Dion
- Requires Python: >=3.6
- License: MIT license
- Keywords: classgraphic, classification, clustering, visualization, ml, machine learning, plotly, interactive
- Requirements: numpy, pandas, plotly, scikit-learn

# classgraphic
[![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)
[![image](https://img.shields.io/pypi/v/classgraphic.svg)](https://pypi.python.org/pypi/classgraphic) 
![Dev](https://github.com/dionresearch/classgraphic/actions/workflows/dev.yml/badge.svg)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dionresearch/classgraphic/HEAD?labpath=notebooks%2FClassGraphic_iris_demo.ipynb)

Interactive classification diagnostic plots for scikit-learn.

![coin sorting machine](https://github.com/dionresearch/classgraphic/raw/main/docs/source/sorter_patent.jpg)

> We classify things for the purpose of doing something to them. Any classification which does not assist manipulation is worse than useless. - Randolph S. Bourne,
  "Education and Living", The Century Co (April 1917)

# Major features:

Plotly-based tables for:

- class_imbalance_table 
- classification_table
- confusion_matrix_table
- describe (dataframe stats)
- prediction_table
- table

And the following charts:

- class_imbalance 
- class_error
- det
- feature_importance
- missing
- precision_recall
- roc
- prediction_histogram
- threshold

For clustering:

- Delaunay triangulations
- Voronoi tessellations

# Try it

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dionresearch/classgraphic/HEAD?labpath=notebooks%2FClassGraphic_iris_demo.ipynb)

Trying it on Binder shows all the details and interactivity. The quickstart below
uses static images, but if you run these commands in a Jupyter notebook, IPython or an IDE,
you will be able to interact with the results.

# Quickstart

```python
from classgraphic.essential import *

# loading the data
df = px.data.iris()

# let's see what kind of data we have
describe(df, transpose=True).show()
```
![dataframe describe table](https://github.com/dionresearch/classgraphic/raw/main/docs/source/describe.png)
```python
# any missing?
missing(df)
```
![missing values chart](https://github.com/dionresearch/classgraphic/raw/main/docs/source/missing.png)
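
To make the idea behind a missing-values check concrete, here is a minimal standard-library sketch that counts `None` and NaN entries per column. It is an illustration only and not how classgraphic computes its chart; the helper `missing_counts` and the toy records are hypothetical.

```python
import math


def missing_counts(rows):
    """Count None/NaN values per column in a list of dict records (hypothetical helper)."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or (
                isinstance(value, float) and math.isnan(value)
            )
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


# Toy records standing in for a dataframe such as df above
rows = [
    {"sepal_length": 5.1, "species": "setosa"},
    {"sepal_length": float("nan"), "species": None},
    {"sepal_length": 6.3, "species": "virginica"},
]
print(missing_counts(rows))
```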
```python
# features
X = df.drop(columns=["species", "species_id"])

# target
y = df["species"]

# Let's check the classes we will be training on and predicting
class_imbalance_table(y, condition="all")
```
![class imbalance table](https://github.com/dionresearch/classgraphic/raw/main/docs/source/imbalance_table.png)
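
The numbers behind a class imbalance table are per-class counts and their share of the total. The sketch below shows that calculation with the standard library only; it is not the classgraphic implementation, and `class_counts` and the toy labels are hypothetical.

```python
from collections import Counter


def class_counts(labels):
    """Return {class: (count, share)} ordered by descending count (hypothetical helper)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: (n, n / total) for cls, n in counts.most_common()}


# Toy labels standing in for a target column such as y above
labels = ["setosa"] * 5 + ["versicolor"] * 3 + ["virginica"] * 2
for cls, (n, share) in class_counts(labels).items():
    print(f"{cls:<12} {n:>3} {share:.0%}")
```

A strongly skewed share column is usually the first hint that accuracy alone will be a misleading metric.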
```python
# train / test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=random_state
)

# We want the total count for each class; bars are stacked by default, which shows that.
# Pass barmode="overlay" to class_imbalance to overlay the bars instead.
class_imbalance(y_train, y_test, condition="train,test")
```
![class imbalance chart](https://github.com/dionresearch/classgraphic/raw/main/docs/source/class_imbalance.png)
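
The split above passes a `random_state` so that the same rows land in train and test on every run. The sketch below shows why a fixed seed gives a reproducible split. It uses only the standard library and is not scikit-learn's algorithm; `shuffled_split` is a hypothetical helper.

```python
import random


def shuffled_split(items, test_size, seed):
    """Shuffle a copy of items with a seeded RNG, then cut it into train and test."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


train_a, test_a = shuffled_split(range(10), test_size=0.5, seed=42)
train_b, test_b = shuffled_split(range(10), test_size=0.5, seed=42)
print(train_a == train_b and test_a == test_b)  # same seed, same split
```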
```python
# model
model = LogisticRegression(max_iter=max_iter, random_state=random_state)
model.fit(X_train, y_train)

# predictions
y_score = model.predict_proba(X_test)
y_pred = model.predict(X_test)

confusion_matrix_table(model, y_test, y_pred).show()
classification_table(model, y_test, y_pred)
```
![confusion matrix table](https://github.com/dionresearch/classgraphic/raw/main/docs/source/confusion.png)
![classification table](https://github.com/dionresearch/classgraphic/raw/main/docs/source/classification_table.png)
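
For readers new to these two tables, the following sketch shows what a confusion matrix tabulates and how per-class precision and recall follow from it. It uses only the standard library and is not the classgraphic or scikit-learn implementation; the helper names and toy labels are hypothetical. Rows are true classes and columns are predicted classes, the same orientation scikit-learn uses.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]


def precision_recall(matrix, index):
    """Precision and recall for the class at position `index`."""
    tp = matrix[index][index]
    predicted = sum(row[index] for row in matrix)  # column total
    actual = sum(matrix[index])  # row total
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


labels = ["cat", "dog"]
y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]
matrix = confusion_matrix(y_true, y_pred, labels)
print(matrix)
print(precision_recall(matrix, 0))
```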
```python
feature_importance(model, y, transpose=True)
```
![feature importance chart](https://github.com/dionresearch/classgraphic/raw/main/docs/source/feature.png)

This concludes the quickstart. There are many more visualizations and tables to explore.

See the `notebooks` and `docs` folders on [GitHub](https://github.com/dionresearch/classgraphic) and the documentation
[website](https://dionresearch.github.io/classgraphic/) for more information.

# Requirements

- Python 3.8 or later
- numpy
- pandas
- plotly>=5.0
- scikit-learn
- nbformat

# Install

If you use conda, create an environment named `classgraphic`, then activate it:

- In Linux:
`source activate classgraphic`

- In Windows:
`conda activate classgraphic`

If you use another environment manager, create and activate your environment
using its normal steps.

Then execute:

```sh
python setup.py install
```

or for installing in [development mode](https://pip.pypa.io/en/latest/cli/pip_install/#install-editable):


```sh
python -m pip install -e . --no-build-isolation
```

or alternatively

```sh
python setup.py develop
```

To install from GitHub instead:
```shell
pip install git+https://github.com/dionresearch/classgraphic
```
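
After any of the install commands above, one way to confirm the package is visible to the active environment is to ask the standard library for its installed version. This is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`); `installed_version` is a hypothetical helper, not part of classgraphic.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if classgraphic is installed in this environment, else None
print(installed_version("classgraphic"))
```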


# See also

- [stemgraphic](https://github.com/dionresearch/stemgraphic): Python package for visualization of data and text
- [Hotelling](https://github.com/dionresearch/hotelling): one- and two-sample Hotelling T2 tests, T2 and f statistics, univariate and multivariate control charts, and anomaly detection


# History


## 0.3.1 (2023-09-20)
* bugfix for describe with pandas 2.x

## 0.3.0 (2023-05-01)

* added 2D clustering visualization
* defaults to Voronoi tessellation
* optional Delaunay triangulation

## 0.2.1 (2022-09-20)

* fixed image not showing on pypi
* fixed feature importance error
* fixed `warning=False` not preventing the warning from showing

## 0.2.1 (2022-09-19)

* added binary classification notebook example
* fixed issue with non dataframe binary classification

## 0.2.0 (2022-09-18)

The previous version was a first step toward a public release. This
release:

* added documentation
* updated the code to be in line with plotly 5.x

It was released to [GitHub](https://github.com/dionresearch/classgraphic) and PyPI.

## 0.1.0 (2019-10-27)

* First private release

## Origins

Inspired by Dion Research LLC's internal EDA/anomaly and end-to-end data science platform.
A dozen charts and tables were initially designed to provide better diagnostic reporting.
Some can also be used for exploratory or explanatory purposes.

See:
https://blog.dionresearch.com/2019/10/visualizations-explanatory-exploratory.html

            
