# Multi-Comparison Matrix (MCM)
### This repository contains the software for our paper titled "[An Approach to Multiple Comparison Benchmark Evaluations that is Stable Under Manipulation of the Comparate Set](https://arxiv.org/abs/2305.11921)". This work has been done by [Ali Ismail-Fawaz](https://hadifawaz1999.github.io/), [Angus Dempster](https://dblp.uni-trier.de/pid/251/8985.html), [Chang Wei Tan](https://changweitan.com/), [Matthieu Herrmann](https://orcid.org/0000-0002-0074-470X), [Lynn Miller](https://au.linkedin.com/in/lynn-miller-bb1aa539), [Daniel Schmidt](https://research.monash.edu/en/persons/daniel-schmidt), [Stefano Berretti](http://www.micc.unifi.it/berretti/), [Jonathan Weber](https://www.jonathan-weber.eu/), [Maxime Devanne](https://maxime-devanne.com/), [Germain Forestier](https://germain-forestier.info/) and [Geoff I. Webb](https://i.giwebb.com/).
## Can now be used through [PyPI](https://pypi.org/project/multi-comp-matrix/)
Install it with `pip install multi-comp-matrix` and use it as explained in the Examples section below.
## Papers Using the MCM:
1. Middlehurst et al. 2024 "[Bake off redux: a review and experimental evaluation of recent time series classification algorithms](https://link.springer.com/article/10.1007/s10618-024-01022-1)" Data Mining and Knowledge Discovery
2. Ismail-Fawaz et al. 2024 "[Finding foundation models for time series classification with a pretext task](https://doi.org/10.1007/978-981-97-2650-9_10)" The Pacific-Asia Conference on Knowledge Discovery and Data Mining - International Workshop on Temporal Analytics
3. Foumani et al. 2023 "[Series2Vec: Similarity-based Self-supervised Representation Learning for Time Series Classification](https://doi.org/10.1007/s10618-024-01043-w)" Data Mining and Knowledge Discovery
4. Holder et al. 2023 "[A review and evaluation of elastic distance functions for time series clustering](https://link.springer.com/article/10.1007/s10115-023-01952-0)" Knowledge and Information Systems
5. Ismail-Fawaz et al. 2023 "[LITE: Light Inception with boosTing tEchniques for Time Series Classification](https://ieeexplore.ieee.org/abstract/document/10302569)" IEEE 10th International Conference on Data Science and Advanced Analytics
6. Koh et al. 2023 "[PSICHIC: physicochemical graph neural network for learning protein-ligand interaction fingerprints from sequence data](https://www.biorxiv.org/content/10.1101/2023.09.17.558145v1.abstract)" bioRxiv
7. Ayllón-Gavilán et al. 2023 "[Convolutional and Deep Learning based techniques for Time Series Ordinal Classification](https://arxiv.org/abs/2306.10084)"
8. Ismail-Fawaz et al. 2023 "[ShapeDBA: Generating Effective Time Series Prototypes Using ShapeDTW Barycenter Averaging](https://link.springer.com/chapter/10.1007/978-3-031-49896-1_9)" The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshop on Advanced Analytics and Learning on Temporal Data
9. Dempster et al. 2023 "[QUANT: A Minimalist Interval Method for Time Series Classification](https://doi.org/10.1007/s10618-024-01036-9)" Data Mining and Knowledge Discovery
10. Holder et al. 2023 "[Clustering Time Series with k-Medoids Based Algorithms](https://link.springer.com/chapter/10.1007/978-3-031-49896-1_4)" The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases Workshop on Advanced Analytics and Learning on Temporal Data
11. Guijo-Rubio et al. 2023 "[Unsupervised feature based algorithms for time series extrinsic regression](https://doi.org/10.1007/s10618-024-01027-w)" Data Mining and Knowledge Discovery
12. Fischer et al. 2024 "[Towards more sustainable and trustworthy reporting in machine learning](https://doi.org/10.1007/s10618-024-01020-3)." Data Mining and Knowledge Discovery
13. Middlehurst et al. 2024 "[aeon: a Python toolkit for learning from time series](https://www.jmlr.org/papers/v25/23-1444.html)."
14. da Silva et al. 2024 "[Artist Similarity based on Heterogeneous Graph Neural Networks](https://doi.org/10.1109/TASLP.2024.3437170)." IEEE/ACM Transactions on Audio, Speech, and Language Processing
15. Renault, Aurélien, et al. ["Early Classification of Time Series: Taxonomy and Benchmark."](https://arxiv.org/abs/2406.18332) arXiv preprint arXiv:2406.18332 (2024).
16. Spinnato, Francesco, et al. ["Fast, Interpretable and Deterministic Time Series Classification with a Bag-Of-Receptive-Fields."](https://link.springer.com/article/10.1007/s10618-024-01020-3) Data Mining and Knowledge Discovery (2024).
17. Lo, Mouhamadou Mansour, et al. ["Time series classification with random convolution kernels based transforms: pooling operators and input representations matter."](https://arxiv.org/abs/2409.01115) arXiv preprint arXiv:2409.01115 (2024).
## Summary
This repository provides a benchmark evaluation method intended for long-term use. It generates a Multi-Comparison Matrix (MCM) in which the user can either include the full pairwise comparison of all comparates or choose which comparates to include in or exclude from the rows and columns of the matrix.
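To make the idea of a pairwise comparison concrete, the sketch below counts wins, ties and losses for every pair of comparates across a set of datasets. It is an illustration only and not the package's implementation: the helper names `win_tie_loss` and `pairwise_counts` are hypothetical, the scores are made up, and it assumes that a higher score is better.

```python
from itertools import combinations


def win_tie_loss(scores_a, scores_b):
    """Count datasets where comparate A beats, ties with, or loses to B."""
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    ties = sum(a == b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    return wins, ties, losses


def pairwise_counts(results):
    """Return {(a, b): (wins, ties, losses)} for every unordered pair."""
    return {
        (a, b): win_tie_loss(results[a], results[b])
        for a, b in combinations(results, 2)
    }


# Made-up scores: one list per comparate, one entry per dataset.
results = {
    "clf1": [0.81, 0.65, 0.92],
    "clf2": [0.79, 0.70, 0.92],
    "clf3": [0.90, 0.68, 0.95],
}
counts = pairwise_counts(results)
```

Each entry of `counts` depends only on the two comparates involved, so adding or removing a third comparate leaves it unchanged.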
## Input Format
The input is a ```.csv``` file containing the statistics of each classifier, in the format of [this example](https://github.com/MSD-IRIMAS/Multi_Pairwise_Comparison/blob/main/results_example.csv).
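As a rough sketch of what such a file might look like, the snippet below writes a small results table with the standard `csv` module. The layout (one row per dataset, one column per comparate), the `dataset_name` column, and all scores are assumptions made for illustration; the linked example file remains the authoritative reference for the expected format.

```python
import csv
import io

# Hypothetical layout: one row per dataset, one column per comparate.
# Column names and scores are made up for illustration.
rows = [
    {"dataset_name": "dataset_a", "clf1": 0.81, "clf2": 0.79, "clf3": 0.90},
    {"dataset_name": "dataset_b", "clf1": 0.65, "clf2": 0.70, "clf3": 0.68},
    {"dataset_name": "dataset_c", "clf1": 0.92, "clf2": 0.92, "clf3": 0.95},
]


def write_results_csv(rows, handle):
    """Write the per-dataset scores as CSV text to an open file handle."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer; a real file opened with newline="" works too.
buffer = io.StringIO()
write_results_csv(rows, buffer)
csv_text = buffer.getvalue()
```

The resulting text can be saved to disk and loaded with `pd.read_csv`, as in the examples further down.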
## Usage of Code - Plot the MCM
To plot the MCM, first load the ```.csv``` file into a ```pandas``` dataframe, then pass it to the ```compare``` function. Specify the ```pdf_savename```, ```png_savename```, ```csv_savename``` or ```tex_savename``` parameter to save the output in ```pdf```, ```png```, ```csv``` or ```tex``` format respectively.
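When the same base name is wanted for several output formats, the keyword arguments can be built programmatically. The helper below is a hypothetical convenience and not part of the package; it relies only on the four `*_savename` parameter names listed above.

```python
def savename_kwargs(basename, formats):
    """Map each requested format to its `<format>_savename` keyword."""
    supported = ("pdf", "png", "csv", "tex")
    unknown = [fmt for fmt in formats if fmt not in supported]
    if unknown:
        raise ValueError(f"unsupported format(s): {unknown}")
    return {f"{fmt}_savename": basename for fmt in formats}


kwargs = savename_kwargs("heatmap", ["pdf", "png"])
# These could then be forwarded as MCM.compare(..., **kwargs).
```

Rejecting unknown formats up front avoids passing a misspelled keyword on to ```compare```.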
## Examples
Generating the MCM on the [following example](https://github.com/MSD-IRIMAS/Multi_Pairwise_Comparison/blob/main/results_example.csv) produces the figure shown after this code:
```
import pandas as pd
from multi_comp_matrix import MCM
df_results = pd.read_csv('path/to/csv')
output_dir = '/output/directory/desired'
MCM.compare(
    output_dir=output_dir,
    df_results=df_results,
    pdf_savename="heatmap",
    png_savename="heatmap",
)
```
<p align="center" width="100%">
<img src="heatmap.png" alt="heatmap-example"/>
</p>
The MCM can also be generated on the [same example](https://github.com/MSD-IRIMAS/Multi_Pairwise_Comparison/blob/main/results_example.csv) while excluding ```clf1``` and ```clf3``` from the columns:
```
import pandas as pd
from multi_comp_matrix import MCM
df_results = pd.read_csv('path/to/csv')
output_dir = '/output/directory/desired'
MCM.compare(
    output_dir=output_dir,
    df_results=df_results,
    excluded_col_comparates=['clf1', 'clf3'],
    png_savename='heatline_vertical',
    tex_savename='heatline_vertical',
    include_ProbaWinTieLoss=True,
)
```
<p align="center" width="100%">
<img src="heatline_vertical.png" alt="heatline-vertical-example"/>
</p>
The same comparates can instead be excluded from the rows:
```
import pandas as pd
from multi_comp_matrix import MCM
df_results = pd.read_csv('path/to/csv')
output_dir = '/output/directory/desired'
MCM.compare(
    output_dir=output_dir,
    df_results=df_results,
    excluded_row_comparates=['clf1', 'clf3'],
    png_savename='heatline_horizontal',
    csv_savename='heatline_horizontal',
)
```
<p align="center" width="100%">
<img src="heatline_horizontal.png" alt="heatline-horizontal-example"/>
</p>
## Requirements
The following Python packages are required to use the module:
1. ```numpy==1.24.4```
2. ```pandas==2.0.3```
3. ```matplotlib==3.7.4```
4. ```scipy==1.10.0```
5. ```baycomp==1.0```
6. ```tqdm==4.66.1```
## Citation
If you use this work, please cite this paper:
```
@article{ismail2023approach,
title={An Approach To Multiple Comparison Benchmark Evaluations That Is Stable Under Manipulation Of The Comparate Set},
author={Ismail-Fawaz, Ali and Dempster, Angus and Tan, Chang Wei and Herrmann, Matthieu and Miller, Lynn and Schmidt, Daniel F and Berretti, Stefano and Weber, Jonathan and Devanne, Maxime and Forestier, Germain and Webb, Geoff I},
journal={arXiv preprint arXiv:2305.11921},
year={2023}
}
```
## Acknowledgments
The work reported in this paper has been supported by the Australian Research Council
under grant DP210100072; the ANR TIMES project (grant ANR-17-CE23-0015); and the ANR
DELEGATION project (grant ANR-21-CE23-0014) of the French Agence Nationale de la
Recherche. The authors would like to thank Professor Eamonn Keogh and all the people
who have contributed to the UCR time series classification archive.
Raw data
{
"_id": null,
"home_page": null,
"name": "multi-comp-matrix",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "Ali Ismail-Fawaz <ali-el-hadi.ismail-fawaz@uha.fr>, Angus Dempster <Angus.Dempster1@monash.edu>, Chang Wei Tan <Chang.Tan@monash.edu>",
"keywords": "data-science, machine-learning, data-mining, time-series, time-series-analysis, time-series-classification, time-series-regression, time-series-machine-learning, benchmarking, benchmarking-machine-learning",
"author": null,
"author_email": "Ali Ismail-Fawaz <ali-el-hadi.ismail-fawaz@uha.fr>, Angus Dempster <Angus.Dempster1@monash.edu>, Chang Wei Tan <Chang.Tan@monash.edu>, Matthieu Herrmann <matthieu.herrmann@monash.edu>, Lynn Miller <lynn.miller1@monash.edu>, \"Daniel F. Schmidt\" <Daniel.Schmidt@monash.edu>, Stefano Berretti <stefano.berretti@unifi.it>, Jonathan Weber <jonathan.weber@uha.fr>, Maxime Devanne <maxime.devanne@uha.fr>, Germain Forestier <germain.forestier@uha.fr>, \"Geoffrey I. Webb\" <geoff.webb@monash.edu>",
"download_url": "https://files.pythonhosted.org/packages/90/b8/fa04fabe221e552ca82b66317e455558e825376ef0a517f553175ac07930/multi_comp_matrix-0.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-only",
"summary": "Multi Comparison Matrix: A long term approach to benchmark evaluations",
"version": "0.0.2",
"project_urls": null,
"split_keywords": [
"data-science",
" machine-learning",
" data-mining",
" time-series",
" time-series-analysis",
" time-series-classification",
" time-series-regression",
" time-series-machine-learning",
" benchmarking",
" benchmarking-machine-learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "85f4e8a1ae70c0b0b05221ba53eb929d9e7c77e84a8c33c0297109ba02421b58",
"md5": "a23c9ca5676cabe6382c980a5a3d4859",
"sha256": "d5afc046ac5d34833e25f9a475006d7104f497cf34aff8ce755e7d3ecba9b406"
},
"downloads": -1,
"filename": "multi_comp_matrix-0.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a23c9ca5676cabe6382c980a5a3d4859",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 30898,
"upload_time": "2024-11-04T17:02:28",
"upload_time_iso_8601": "2024-11-04T17:02:28.314250Z",
"url": "https://files.pythonhosted.org/packages/85/f4/e8a1ae70c0b0b05221ba53eb929d9e7c77e84a8c33c0297109ba02421b58/multi_comp_matrix-0.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "90b8fa04fabe221e552ca82b66317e455558e825376ef0a517f553175ac07930",
"md5": "58fe0239b6b17a086494bb2f35d54ef2",
"sha256": "e0de5b6d7b5c3a399059d20fd01bb07254524838aee86f5248f3c6aaa38aecc4"
},
"downloads": -1,
"filename": "multi_comp_matrix-0.0.2.tar.gz",
"has_sig": false,
"md5_digest": "58fe0239b6b17a086494bb2f35d54ef2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 244897,
"upload_time": "2024-11-04T17:02:29",
"upload_time_iso_8601": "2024-11-04T17:02:29.644264Z",
"url": "https://files.pythonhosted.org/packages/90/b8/fa04fabe221e552ca82b66317e455558e825376ef0a517f553175ac07930/multi_comp_matrix-0.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-04 17:02:29",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "multi-comp-matrix"
}