# Significance Analysis
[![PyPI version](https://img.shields.io/pypi/v/significance-analysis?color=informational)](https://pypi.org/project/significance-analysis/)
[![Python versions](https://img.shields.io/pypi/pyversions/significance-analysis)](https://pypi.org/project/significance-analysis/)
[![License](https://img.shields.io/pypi/l/significance-analysis?color=informational)](LICENSE)
[![Coverage Status](./tests/coverage-badge.svg?dummy=8484744)](./tests/reports/cov_html/index.html)
[![Tests Status](./tests/tests-badge.svg?dummy=8484744)](./tests/reports/junit/report.html)
[![arXiv](https://img.shields.io/badge/arXiv-2408.02533-b31b1b.svg)](https://arxiv.org/abs/2408.02533)
This package analyses datasets of different HPO algorithms run on multiple benchmarks, using a Linear Mixed-Effects Model (LMEM)-based approach.
## Note
As indicated by the `v0.x.x` version number, Significance Analysis is early-stage code and APIs may change in the future.
## Documentation
For an interactive overview, please have a look at our [example](significance_analysis_example/analysis_example.ipynb).
Every dataset should be a pandas dataframe of the following format:
| algorithm | benchmark | metric | optional: budget/prior/... |
| ---------- | ---------- | ------ | -------------------------- |
| Algorithm1 | Benchmark1 | 3.141 | 1.0 |
| Algorithm1 | Benchmark1 | 6.283 | 2.0 |
| Algorithm1 | Benchmark2 | 2.718 | 1.0 |
| ... | ... | ... | ... |
| Algorithm2 | Benchmark2 | 0.621 | 2.0 |
Since the data is used to fit a model, there cannot be any missing values, but duplicates are allowed.
Our function `dataframe_validator` checks for this format.
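For illustration, a minimal dataframe in this format can be built directly with pandas (the values mirror the table above; in practice you would pass the result through `dataframe_validator` before modeling):

```python
import pandas as pd

# A minimal dataset in the expected long format: one row per observation,
# with algorithm, benchmark, metric, and optional columns such as budget.
data = pd.DataFrame(
    {
        "algorithm": ["Algorithm1", "Algorithm1", "Algorithm1", "Algorithm2"],
        "benchmark": ["Benchmark1", "Benchmark1", "Benchmark2", "Benchmark2"],
        "metric": [3.141, 6.283, 2.718, 0.621],
        "budget": [1.0, 2.0, 1.0, 2.0],
    }
)

# The model fit requires complete data: no missing values allowed.
assert not data.isna().any().any()
```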
## Installation
Using R (>= 4.0.0), install the packages Matrix, emmeans, lmerTest, and lme4.
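Assuming R and `Rscript` are on your PATH, the R dependencies can be installed non-interactively, for example:

```bash
Rscript -e 'install.packages(c("Matrix", "emmeans", "lmerTest", "lme4"), repos = "https://cloud.r-project.org")'
```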
Using pip
```bash
pip install significance-analysis
```
## Usage for significance testing
1. Generate data from HPO algorithms on benchmarks, saving it in our format.
1. Build a model with all factors of interest.
1. Do post-hoc testing.
1. Plot the results as a CD diagram.
In code, the usage pattern can look like this:
```python
import pandas as pd
from significance_analysis import dataframe_validator, model, cd_diagram
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Build the model
mod = model("value ~ algorithm + (1|benchmark) + prior", data)
# 3. Conduct the post-hoc analysis
post_hoc_results = mod.post_hoc("algorithm")
# 4. Plot the results
cd_diagram(post_hoc_results)
```
## Usage for hypothesis testing
Use the GLRT implementation or our prepared sanity checks to conduct LMEM-based hypothesis testing.
In code:
```python
import pandas as pd

from significance_analysis import (
    dataframe_validator,
    glrt,
    model,
    seed_dependency_check,
    benchmark_information_check,
    fidelity_check,
)
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Run the preconfigured sanity checks
seed_dependency_check(data)
benchmark_information_check(data)
fidelity_check(data)
# 3. Run a custom hypothesis test, comparing model_1 and model_2
model_1 = model("value ~ algorithm", data)
model_2 = model("value ~ 1", data)
glrt(model_1, model_2)
```
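For intuition, a GLRT compares two nested models via twice the difference of their maximized log-likelihoods, referenced against a chi-squared distribution. A minimal sketch of that statistic (a hypothetical helper, not this package's API; it assumes the log-likelihoods and the difference in parameter count are already known):

```python
from scipy.stats import chi2


def glrt_sketch(loglik_alt: float, loglik_null: float, df_diff: int):
    """Likelihood-ratio test: statistic 2*(ll_alt - ll_null),
    p-value from a chi-squared distribution with df_diff degrees of freedom."""
    stat = 2.0 * (loglik_alt - loglik_null)
    p_value = chi2.sf(stat, df_diff)
    return stat, p_value
```

A small p-value suggests the richer model (e.g. `value ~ algorithm`) explains the data significantly better than the null model (`value ~ 1`).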
## Usage for metafeature impact analysis
Analyze the influence a metafeature has on the performance of two algorithms.
In code:
```python
import pandas as pd

from significance_analysis import dataframe_validator, metafeature_analysis
# 1. Generate/import dataset
data = dataframe_validator(pd.read_parquet("datasets/priorband_data.parquet"))
# 2. Run the metafeature analysis
scores = metafeature_analysis(data, ("HB", "PB"), "prior")
```
For more details and features, please have a look at our [example](significance_analysis_example/analysis_example.py).
## Contributing
We welcome contributions from everyone; feel free to raise issues or submit pull requests.
______________________________________________________________________
### To cite the paper or code
```bibtex
@misc{geburek2024lmemsposthocanalysishpo,
  title={LMEMs for post-hoc analysis of HPO Benchmarking},
  author={Anton Geburek and Neeratyoy Mallik and Danny Stoll and Xavier Bouthillier and Frank Hutter},
  year={2024},
  eprint={2408.02533},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```