[![](https://martinweigl.github.io/pycaleva/assets/logo.svg)](https://martinweigl.github.io/pycaleva/)
[Documentation]: https://martinweigl.github.io/pycaleva/
### A framework for calibration evaluation of binary classification models.
---
When performing classification tasks, you sometimes want the probability of a class label rather than the label itself, for example to estimate a patient's risk of cancer. A well-calibrated model delivers predicted probabilities that closely match the actual class membership probabilities. This framework was developed to let users **measure the calibration of binary classification models**.
- Evaluate the calibration of binary classification models with probabilistic output (logistic regression, SVMs, neural networks, ...).
- Apply your model to test data and pass the true class labels and predicted probabilities to the framework.
- Various statistical tests, metrics and plots are available.
- Supports creating a calibration report in PDF format for your model.
\
<img src="https://martinweigl.github.io/pycaleva/assets/design.png" width="600" alt="Image Design">
\
\
See the [documentation] for detailed information about classes and methods.
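The inputs to the framework are just the true class labels and the model's predicted probabilities for the positive class. A minimal sketch of how to obtain them with scikit-learn (the dataset and model below are illustrative, not part of PyCalEva):

```python
# Prepare the two arrays PyCalEva expects: true labels and
# predicted probabilities of the positive class on held-out data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred_prob = model.predict_proba(X_test)[:, 1]  # probability of class 1

# These two arrays are then passed to the evaluator, e.g.:
# from pycaleva import CalibrationEvaluator
# ce = CalibrationEvaluator(y_test, pred_prob, outsample=True, n_groups='auto')
```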
## Installation
$ pip install pycaleva
or build from source:
$ git clone https://github.com/MartinWeigl/pycaleva.git
$ cd pycaleva
$ python setup.py install
## Requirements
- numpy>=1.26
- scipy>=1.13
- scikit-learn>=1.4
- matplotlib>=3.8
- tqdm>=4.66
- pandas>=2.2
- statsmodels>=0.14
- fpdf2>=2.7
- ipython>=8.24
## Usage
- Import and initialize
```python
from pycaleva import CalibrationEvaluator
ce = CalibrationEvaluator(y_test, pred_prob, outsample=True, n_groups='auto')
```
- Apply statistical tests
```python
ce.hosmerlemeshow() # Hosmer Lemeshow Test
ce.pigeonheyse() # Pigeon Heyse Test
ce.z_test() # Spiegelhalter z-Test
ce.calbelt(plot=False) # Calibration Belt (Test only)
```
- Show calibration plot
```python
ce.calibration_plot()
```
- Show calibration belt
```python
ce.calbelt(plot=True)
```
- Get various metrics
```python
ce.metrics()
```
- Create pdf calibration report
```python
ce.calibration_report('report.pdf', 'my_model')
```
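For intuition, the Hosmer-Lemeshow test bins observations by predicted risk and compares observed to expected event counts per bin with a chi-squared statistic. A plain NumPy/SciPy sketch of this idea (not PyCalEva's implementation; it ignores edge cases such as empty bins or predicted probabilities of exactly 0 or 1):

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow goodness-of-fit: sort by predicted risk,
    split into groups, and compare observed vs. expected events."""
    order = np.argsort(y_prob)
    y_true = np.asarray(y_true, dtype=float)[order]
    y_prob = np.asarray(y_prob, dtype=float)[order]

    stat = 0.0
    for idx in np.array_split(np.arange(len(y_prob)), n_groups):
        observed = y_true[idx].sum()          # observed events in group
        expected = y_prob[idx].sum()          # expected events in group
        mean_p = expected / len(idx)          # mean predicted risk
        stat += (observed - expected) ** 2 / (len(idx) * mean_p * (1 - mean_p))

    dof = n_groups - 2                        # conventional degrees of freedom
    return stat, chi2.sf(stat, dof), dof
```

A large p-value here means no evidence of miscalibration, matching the `hltest_result` outputs shown below.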
See the [documentation] of the individual methods for detailed usage examples.
## Example Results
| Well calibrated model | Poorly calibrated model |
| :---------------------------------------------------------------------------------------------------------------------------------: | :-------------------------------------------------------------------------------------------------------------------------------------: |
| <img src="https://martinweigl.github.io/pycaleva/assets/calplot_well.png" width="65%" alt="Image Calibration plot well calibrated"> | <img src="https://martinweigl.github.io/pycaleva/assets/calplot_poorly.png" width="65%" alt="Image Calibration plot poorly calibrated"> |
| <img src="https://martinweigl.github.io/pycaleva/assets/calbelt_well.png" width="65%" alt="Image Calibration belt well calibrated"> | <img src="https://martinweigl.github.io/pycaleva/assets/calbelt_poorly.png" width="65%" alt="Image Calibration belt poorly calibrated"> |
| <pre lang="python">hltest_result(statistic=4.982635477424991, pvalue=0.8358193332183672, dof=9)</pre> | <pre lang="python">hltest_result(statistic=26.32792475118742, pvalue=0.0018051545107069522, dof=9)</pre> |
| <pre lang="python">ztest_result(statistic=-0.21590257919669287, pvalue=0.829063686607032)</pre> | <pre lang="python">ztest_result(statistic=-3.196125145498827, pvalue=0.0013928668407116645)</pre> |
## Features
- Statistical tests for binary model calibration
- Hosmer Lemeshow Test
- Pigeon Heyse Test
- Spiegelhalter z-test
- Calibration belt
- Graphical representations showing the calibration of binary models
- Calibration plot
- Calibration belt
- Various Metrics
- Brier Score
- Adaptive Calibration Error
- Maximum Calibration Error
- Area within LOWESS Curve
- (AUROC)
The above features are explained in more detail in PyCalEva's [documentation].
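As a quick illustration of one metric from the list, the Brier score is the mean squared difference between predicted probabilities and the observed 0/1 outcomes (a plain-NumPy sketch, not PyCalEva's implementation):

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))

# A perfectly confident, always-correct model scores 0.0;
# predicting 0.5 for every case scores 0.25.
print(brier_score([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.2]))  # 0.025
```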
## References
- **Statistical tests and metrics**:
[1] Hosmer Jr, David W., Stanley Lemeshow, and Rodney X. Sturdivant.
Applied logistic regression. Vol. 398. John Wiley & Sons, 2013.
[2] Pigeon, Joseph G., and Joseph F. Heyse.
An improved goodness of fit statistic for probability prediction models.
Biometrical Journal: Journal of Mathematical Methods in Biosciences 41.1 (1999): 71-82.
[3] Spiegelhalter, D. J. (1986). Probabilistic prediction in patient management and clinical trials.
Statistics in medicine, 5(5), 421-433.
[4] Huang, Y., Li, W., Macheret, F., Gabriel, R. A., & Ohno-Machado, L. (2020).
A tutorial on calibration measurements and calibration models for clinical prediction models.
Journal of the American Medical Informatics Association, 27(4), 621-633.
- **Calibration plot**:
[5] Jr, F. E. H. (2021). rms: Regression modeling strategies (R package version
6.2-0) [Computer software]. The Comprehensive R Archive Network.
Available from https://CRAN.R-project.org/package=rms
- **Calibration belt**:
[6] Nattino, G., Finazzi, S., & Bertolini, G. (2014). A new calibration test
and a reappraisal of the calibration belt for the assessment of prediction models
based on dichotomous outcomes. Statistics in medicine, 33(14), 2390-2407.
[7] Bulgarelli, L. (2021). calibration-belt: Assessment of calibration in binomial prediction models [Computer software].
Available from https://github.com/fabiankueppers/calibration-framework
[8] Nattino, G., Finazzi, S., Bertolini, G., Rossi, C., & Carrara, G. (2017).
givitiR: The giviti calibration test and belt (R package version 1.3) [Computer
software]. The Comprehensive R Archive Network.
Available from https://CRAN.R-project.org/package=givitiR
- **Others**:
[9] Sturges, H. A. (1926). The choice of a class interval.
Journal of the american statistical association, 21(153), 65-66.
For most of the implemented methods in this software you can find references in the [documentation] as well.