# CLASSICALLY
Classifier Comparison and Feature Analysis Tools in Python
A Python package for evaluating classification performance.
## Installation
__Classically__ is published on PyPI, so you can install it with `pip`:

    python -m pip install classically
## Applications
### Critical Difference Diagram
One main application is the comparison of categorized, paired metric data, e.g. the accuracy
results of different classification methods in machine learning.
For this purpose, __Classically__ implements the *critical difference diagram* described in [[1]](#1).
<a id="2"><u>Example</u></a>
Imagine that we test five different classification methods on 14 different datasets.
Each classifier returns an accuracy result on the test set of every dataset.
We collect the results in a table like this:
Classifier |  1   |  2   |  3   |  4   |  5   |  6   |  7   |  8   |  9   |  10  |  11  |  12  |  13  |  14  |
-----------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
A | 0.60 | 0.81 | 0.62 | 0.19 | 0.93 | 0.54 | 0.53 | 0.41 | 0.21 | 0.97 | 0.32 | 0.82 | 0.38 | 0.75 |
B | 0.33 | 0.68 | 0.43 | 0.23 | 0.90 | 0.43 | 0.32 | 0.20 | 0.22 | 0.86 | 0.21 | 0.82 | 0.41 | 0.73 |
C | 0.25 | 0.64 | 0.40 | 0.10 | 0.85 | 0.39 | 0.31 | 0.19 | 0.18 | 0.90 | 0.23 | 0.78 | 0.43 | 0.71 |
D | 0.64 | 0.84 | 0.60 | 0.26 | 0.95 | 0.60 | 0.36 | 0.37 | 0.19 | 0.95 | 0.44 | 0.84 | 0.41 | 0.84 |
E | 0.37 | 0.68 | 0.47 | 0.18 | 0.88 | 0.37 | 0.27 | 0.25 | 0.24 | 0.79 | 0.25 | 0.83 | 0.36 | 0.64 |
We load this table into a NumPy array of shape `(5, 14)` and call the function
`classically.critical_difference_diagram`. The resulting plot is shown below.
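A minimal sketch of this call; any further keyword arguments (e.g. for classifier labels or the
significance level) are assumptions and may differ from the package's actual signature.

```python
import numpy as np
import classically

# Rows = classifiers A-E, columns = the 14 datasets from the table above.
accuracies = np.array([
    [0.60, 0.81, 0.62, 0.19, 0.93, 0.54, 0.53, 0.41, 0.21, 0.97, 0.32, 0.82, 0.38, 0.75],
    [0.33, 0.68, 0.43, 0.23, 0.90, 0.43, 0.32, 0.20, 0.22, 0.86, 0.21, 0.82, 0.41, 0.73],
    [0.25, 0.64, 0.40, 0.10, 0.85, 0.39, 0.31, 0.19, 0.18, 0.90, 0.23, 0.78, 0.43, 0.71],
    [0.64, 0.84, 0.60, 0.26, 0.95, 0.60, 0.36, 0.37, 0.19, 0.95, 0.44, 0.84, 0.41, 0.84],
    [0.37, 0.68, 0.47, 0.18, 0.88, 0.37, 0.27, 0.25, 0.24, 0.79, 0.25, 0.83, 0.36, 0.64],
])

classically.critical_difference_diagram(accuracies)
```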
![critical difference diagram](example/cdd_example.png)
Each marking on this number line represents the average rank of one classifier based on its
accuracy over all datasets; the lowest rank corresponds to the highest accuracy. Two classifiers
are connected by a horizontal line if their difference is not significant. This significance is
based on post-hoc Wilcoxon signed-rank tests for each pair of classifiers.
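The average ranks and the pairwise tests can be reproduced by hand. Here is a short sketch with
SciPy, reusing the `accuracies` array from above; note that the package may additionally correct
for multiple testing (e.g. with Holm's method), which is an assumption here.

```python
from scipy.stats import rankdata, wilcoxon

# Rank the classifiers per dataset (rank 1 = highest accuracy), then average.
ranks = rankdata(-accuracies, axis=0)   # shape (5, 14)
avg_ranks = ranks.mean(axis=1)          # one average rank per classifier
print(avg_ranks)

# Pairwise Wilcoxon signed-rank test, e.g. for classifiers D (row 3) and A (row 0).
stat, p = wilcoxon(accuracies[3], accuracies[0])
print(p > 0.05)  # True: no significant difference at alpha = 0.05
```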
Based on this, classifier D seems to be the best choice for the overall classification task:
it works best on the 14 chosen datasets, although it is not the best classifier on every single
dataset. But we can also see that there is no significant difference (`alpha=0.05`) between the
accuracy results of classifiers D and A. If D were much more computationally expensive than A,
we should consider choosing A instead.
### Scatter Matrix
For an in-depth comparison of classifiers on individual datasets, __Classically__ provides a
special type of scatter matrix designed to compare multiple categories of data.
<u>Example</u>
To make a more elaborate decision in the [example above](#2), we could directly compare the best
three classifiers A, B and D using the function `classically.scatter_comparison`.
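Continuing the sketch from the previous example (any keyword arguments for labels are assumptions):

```python
# Compare classifiers A, B and D (rows 0, 1 and 3 of the accuracy array).
classically.scatter_comparison(accuracies[[0, 1, 3]])
```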
![scatter comparison](example/scatter_example.png)
Points above the diagonal line represent datasets that are classified better by the method named
in the upper left corner of the panel. The horizontal and vertical lines indicate the mean
accuracy of the corresponding classifier; a solid line marks the higher mean.
This makes the choice easy for the comparison of classifiers A and B as well as B and D.
We also see that D beats A in mean accuracy, but that A has a big advantage on one dataset,
which lies well beyond the diagonal line marking a five percent difference.
The datasets could now be analyzed further, for example by looking at their numbers of training
and test instances; an option is available to set the opacity of the points in the scatter plots
accordingly.
### Feature Score
Evaluating the importance of features in data can be very helpful for reducing the dimensionality
of the feature space. While principal component analysis (PCA) transforms the original features
into new ones, it can also be used to create a ranking of the original features. __Classically__
can compute such a feature score for a given dataset.
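One common way to derive such a score from PCA is to weight the absolute component loadings by
each component's explained variance ratio. The following formula is an assumption for
illustration and may differ from the one implemented in the package:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

# Assumed scoring rule: |loadings| weighted by the explained variance ratios.
pca = PCA().fit(X)
score = np.abs(pca.components_).T @ pca.explained_variance_ratio_
score /= score.max()  # normalize so the highest-scoring feature gets 1
print(score)
```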
<u>Example</u>
We can analyze the importance of the four features in the well-known Iris dataset using the
function `classically.plot_feature_score`.
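A minimal call might look as follows; whether the function also takes the targets or feature
names is an assumption:

```python
from sklearn.datasets import load_iris
import classically

X, y = load_iris(return_X_y=True)
classically.plot_feature_score(X, y)  # signature assumed
```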
<p align="center"><img width="512" src="example/score_example.png"></p>
The plot shows the normalized feature score. The 'most important' feature according to this score
is 'petal length (cm)'. All other features have a relatively low score; for example, the score of
'sepal length (cm)' is about 80% lower. The red markings show the accuracy results of a ridge
classifier that uses only the first `n` features in descending score order, where `n` is
incremented with each step on the x-axis.
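The red markings correspond to an evaluation along the following lines; the feature order and
the use of five-fold cross-validation are assumptions for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Features sorted by descending score (assumed order: petal length first).
order = [2, 3, 0, 1]

for n in range(1, len(order) + 1):
    acc = cross_val_score(RidgeClassifier(), X[:, order[:n]], y, cv=5).mean()
    print(f"first {n} feature(s): mean accuracy {acc:.3f}")
```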
## References
<a id="1">[1]</a>
Demšar, Janez (2006).
"Statistical comparisons of classifiers over multiple data sets."
Journal of Machine Learning Research 7, 1–30.