![PyPI - Version](https://img.shields.io/pypi/v/MLstatkit)
![PyPI - License](https://img.shields.io/pypi/l/MLstatkit)
![PyPI - Status](https://img.shields.io/pypi/status/MLstatkit)
![PyPI - Wheel](https://img.shields.io/pypi/wheel/MLstatkit)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/MLstatkit)
![PyPI - Download](https://img.shields.io/pypi/dm/MLstatkit)
[![Downloads](https://static.pepy.tech/badge/MLstatkit)](https://pepy.tech/project/MLstatkit)
# MLstatkit
MLstatkit is a Python library that integrates established statistical methods into machine learning projects. It currently provides:

- **DeLong's test**: compares the areas under two correlated Receiver Operating Characteristic (ROC) curves.
- **Bootstrapping**: estimates confidence intervals for common performance metrics by resampling with replacement.
- **Permutation_test**: assesses whether the difference between two models' metrics is statistically significant by shuffling predictions and recomputing the metric to build a distribution of differences.
- **AUC2OR**: converts the Area Under the ROC Curve (AUC) into several related statistics, such as Cohen’s d, Pearson’s rpb, the odds ratio, and the natural log odds ratio.

With its modular design, MLstatkit gives researchers and data scientists a flexible, powerful toolkit for augmenting analyses and model evaluations across a broad range of statistical testing needs.
## Installation
Install MLstatkit directly from PyPI using pip:
```bash
pip install MLstatkit
```
## Usage
### DeLong's Test
The `Delong_test` function statistically compares the **areas under two correlated Receiver Operating Characteristic (ROC) curves** produced by two different models on the same data, enabling a deeper understanding of comparative model performance.
#### Parameters:
- **true** : array-like of shape (n_samples,)
True binary labels in range {0, 1}.
- **prob_A** : array-like of shape (n_samples,)
Predicted probabilities by the first model.
- **prob_B** : array-like of shape (n_samples,)
Predicted probabilities by the second model.
#### Returns:
- **z_score** : float
The z score from comparing the AUCs of two models.
- **p_value** : float
The p value from comparing the AUCs of two models.
#### Example:
```python
import numpy as np
from MLstatkit.stats import Delong_test
# Example data
true = np.array([0, 1, 0, 1])
prob_A = np.array([0.1, 0.4, 0.35, 0.8])
prob_B = np.array([0.2, 0.3, 0.4, 0.7])
# Perform DeLong's test
z_score, p_value = Delong_test(true, prob_A, prob_B)
print(f"Z-Score: {z_score}, P-Value: {p_value}")
```
This demonstrates the usage of `Delong_test` to statistically compare the AUCs of two models based on their probabilities and the true labels. The returned z-score and p-value help in understanding if the difference in model performances is statistically significant.
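For intuition about the quantity being compared, recall that the AUC equals the probability that a randomly chosen positive sample is ranked above a randomly chosen negative one. The sketch below (plain NumPy, not part of MLstatkit; `pairwise_auc` is a hypothetical helper) verifies the two AUCs behind the example above:

```python
import numpy as np

def pairwise_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

true = np.array([0, 1, 0, 1])
prob_A = np.array([0.1, 0.4, 0.35, 0.8])
prob_B = np.array([0.2, 0.3, 0.4, 0.7])

print(pairwise_auc(true, prob_A))  # 1.0  (every positive outranks every negative)
print(pairwise_auc(true, prob_B))  # 0.75 (one positive/negative pair is misordered)
```

DeLong's test then asks whether a gap like 1.0 vs 0.75 is larger than could be explained by the sampling variability of the two (correlated) AUC estimates.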
### Bootstrapping for Confidence Intervals
The `Bootstrapping` function computes confidence intervals for a chosen performance metric by resampling with replacement, providing a measure of the estimate's reliability. It supports AUROC (area under the ROC curve), AUPRC (area under the precision-recall curve), F1, and the other metrics listed below.
#### Parameters:
- **true** : array-like of shape (n_samples,)
True binary labels, where the labels are either {0, 1}.
- **prob** : array-like of shape (n_samples,)
    Predicted probabilities, as returned by a classifier's `predict_proba` method, or binary predictions based on the specified scoring function and threshold.
- **metric_str** : str, default='f1'
Identifier for the scoring function to use. Supported values include 'f1', 'accuracy', 'recall', 'precision', 'roc_auc', 'pr_auc', and 'average_precision'.
- **n_bootstraps** : int, default=1000
The number of bootstrap iterations to perform. Increasing this number improves the reliability of the confidence interval estimation but also increases computational time.
- **confidence_level** : float, default=0.95
The confidence level for the interval estimation. For instance, 0.95 represents a 95% confidence interval.
- **threshold** : float, default=0.5
A threshold value used for converting probabilities to binary labels for metrics like 'f1', where applicable.
- **average** : str, default='macro'
Specifies the method of averaging to apply to multi-class/multi-label targets. Other options include 'micro', 'samples', 'weighted', and 'binary'.
- **random_state** : int, default=0
Seed for the random number generator. This parameter ensures reproducibility of results.
#### Returns:
- **original_score** : float
The score calculated from the original dataset without bootstrapping.
- **confidence_lower** : float
The lower bound of the confidence interval.
- **confidence_upper** : float
The upper bound of the confidence interval.
#### Examples:
```python
import numpy as np
from MLstatkit.stats import Bootstrapping
# Example data
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
# Calculate confidence intervals for AUROC
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'roc_auc')
print(f"AUROC: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")
# Calculate confidence intervals for AUPRC
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'pr_auc')
print(f"AUPRC: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")
# Calculate confidence intervals for F1 score with a custom threshold
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'f1', threshold=0.5)
print(f"F1 Score: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")
# Calculate confidence intervals for AUROC, AUPRC, F1 score
for score in ['roc_auc', 'pr_auc', 'f1']:
    original_score, conf_lower, conf_upper = Bootstrapping(y_true, y_prob, score, threshold=0.5)
    print(f"{score.upper()} original score: {original_score:.3f}, confidence interval: [{conf_lower:.3f} - {conf_upper:.3f}]")
```
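Under the hood, a percentile bootstrap is conceptually simple: resample (label, probability) pairs with replacement, recompute the metric on each resample, and take quantiles of the resampled scores. The sketch below is an illustrative stand-in, not MLstatkit's implementation; `percentile_bootstrap` and the accuracy lambda are hypothetical:

```python
import numpy as np

def percentile_bootstrap(y_true, y_prob, metric_fn, n_bootstraps=1000,
                         confidence_level=0.95, random_state=0):
    """Percentile-interval bootstrap: resample (label, prob) pairs with replacement."""
    rng = np.random.RandomState(random_state)
    n = len(y_true)
    scores = []
    for _ in range(n_bootstraps):
        idx = rng.randint(0, n, n)            # sample n indices with replacement
        if len(np.unique(y_true[idx])) < 2:   # skip one-class resamples (needed for rank metrics)
            continue
        scores.append(metric_fn(y_true[idx], y_prob[idx]))
    alpha = (1.0 - confidence_level) / 2.0
    lower, upper = np.quantile(scores, [alpha, 1.0 - alpha])
    return metric_fn(y_true, y_prob), lower, upper

# accuracy at a 0.5 threshold as a simple example metric
acc = lambda t, p: np.mean((p >= 0.5).astype(int) == t)
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
score, lo, hi = percentile_bootstrap(y_true, y_prob, acc)
print(f"accuracy {score:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

On a dataset this small the interval is very wide, which is exactly the point: the bootstrap makes the uncertainty of the point estimate visible.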
### Permutation Test for Statistical Significance
The `Permutation_test` function assesses the statistical significance of the difference between two models' metrics by randomly shuffling the data and recalculating the metrics to create a distribution of differences. This method does not assume a specific distribution of the data, making it a robust choice for comparing model performance.
#### Parameters:
- **y_true** : array-like of shape (n_samples,)
True binary labels, where the labels are either {0, 1}.
- **prob_model_A** : array-like of shape (n_samples,)
Predicted probabilities from the first model.
- **prob_model_B** : array-like of shape (n_samples,)
Predicted probabilities from the second model.
- **metric_str** : str, default='f1'
The metric for comparison. Supported metrics include 'f1', 'accuracy', 'recall', 'precision', 'roc_auc', 'pr_auc', and 'average_precision'.
- **n_bootstraps** : int, default=1000
The number of permutation samples to generate.
- **threshold** : float, default=0.5
A threshold value used for converting probabilities to binary labels for metrics like 'f1', where applicable.
- **average** : str, default='macro'
Specifies the method of averaging to apply to multi-class/multi-label targets. Other options include 'micro', 'samples', 'weighted', and 'binary'.
- **random_state** : int, default=0
Seed for the random number generator. This parameter ensures reproducibility of results.
#### Returns:
- **metric_a** : float
The calculated metric for model A using the original data.
- **metric_b** : float
The calculated metric for model B using the original data.
- **p_value** : float
The p-value from the permutation test, indicating the probability of observing a difference as extreme as, or more extreme than, the observed difference under the null hypothesis.
- **benchmark** : float
The observed difference between the metrics of model A and model B.
- **samples_mean** : float
The mean of the permuted differences.
- **samples_std** : float
The standard deviation of the permuted differences.
#### Examples:
```python
import numpy as np
from MLstatkit.stats import Permutation_test
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
prob_model_A = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
prob_model_B = np.array([0.2, 0.3, 0.25, 0.85, 0.15, 0.35, 0.45, 0.65, 0.01])
# Conduct a permutation test to compare F1 scores
metric_a, metric_b, p_value, benchmark, samples_mean, samples_std = Permutation_test(
    y_true, prob_model_A, prob_model_B, 'f1'
)
print(f"F1 Score Model A: {metric_a:.5f}, Model B: {metric_b:.5f}")
print(f"Observed Difference: {benchmark:.5f}, p-value: {p_value:.5f}")
print(f"Permuted Differences Mean: {samples_mean:.5f}, Std: {samples_std:.5f}")
```
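The permutation logic itself can be sketched in a few lines. The helper below is an illustrative stand-in, not MLstatkit's implementation: under the null hypothesis the two models are exchangeable, so each sample's pair of predictions is randomly swapped, the metric difference is recomputed, and the observed difference is compared against the resulting distribution:

```python
import numpy as np

def swap_permutation_test(y_true, prob_a, prob_b, metric_fn,
                          n_permutations=1000, random_state=0):
    """Exchangeability null: randomly swap each sample's two predictions,
    recompute the metric difference, and compare to the observed difference."""
    rng = np.random.RandomState(random_state)
    observed = metric_fn(y_true, prob_a) - metric_fn(y_true, prob_b)
    diffs = np.empty(n_permutations)
    for i in range(n_permutations):
        swap = rng.rand(len(y_true)) < 0.5       # per-sample coin flip
        pa = np.where(swap, prob_b, prob_a)
        pb = np.where(swap, prob_a, prob_b)
        diffs[i] = metric_fn(y_true, pa) - metric_fn(y_true, pb)
    p_value = np.mean(np.abs(diffs) >= np.abs(observed))
    return observed, p_value, diffs.mean(), diffs.std()

acc = lambda t, p: np.mean((p >= 0.5).astype(int) == t)
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
prob_model_A = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
prob_model_B = np.array([0.2, 0.3, 0.25, 0.85, 0.15, 0.35, 0.45, 0.65, 0.01])
obs, p, m, s = swap_permutation_test(y_true, prob_model_A, prob_model_B, acc)
print(f"observed diff {obs:.3f}, p-value {p:.3f}")
```

On this tiny example both models score identically at the 0.5 threshold, so the observed difference is 0 and the p-value is 1; real datasets yield a more informative null distribution.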
### Conversion of AUC to Odds Ratio (OR)
The `AUC2OR` function converts an Area Under the Curve (AUC) value to an Odds Ratio (OR) and optionally returns intermediate values such as t, z, d, and ln_OR. This conversion is useful for understanding the relationship between AUC, a common metric in binary classification, and OR, which is often used in statistical analyses.
#### Parameters:
- **AUC** : float
The Area Under the Curve (AUC) value to be converted.
- **return_all** : bool, default=False
If True, returns intermediate values (t, z, d, ln_OR) in addition to OR.
#### Returns:
- **OR** : float
The calculated Odds Ratio (OR) from the given AUC value.
- **t** : float, optional
Intermediate value calculated from AUC.
- **z** : float, optional
Intermediate value calculated from t.
- **d** : float, optional
Intermediate value calculated from z.
- **ln_OR** : float, optional
The natural logarithm of the Odds Ratio.
#### Examples:
```python
from MLstatkit.stats import AUC2OR
AUC = 0.7 # Example AUC value
# Convert AUC to OR and retrieve all intermediate values
t, z, d, ln_OR, OR = AUC2OR(AUC, return_all=True)
print(f"t: {t:.5f}, z: {z:.5f}, d: {d:.5f}, ln_OR: {ln_OR:.5f}, OR: {OR:.5f}")
# Convert AUC to OR without intermediate values
OR = AUC2OR(AUC)
print(f"OR: {OR:.5f}")
```
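The arithmetic behind such conversions is short. The sketch below follows one widely used conversion chain (cf. Salgado, 2018): probit-transform the AUC, scale to Cohen's d, then map d to a log odds ratio via the logistic approximation. It is an assumption-laden stand-in; MLstatkit's exact intermediates (t, z, d, ln_OR) may differ:

```python
import math
from statistics import NormalDist  # stdlib inverse normal CDF (Python 3.8+)

def auc_to_or(auc):
    """One common AUC -> OR conversion chain; not necessarily AUC2OR's exact steps."""
    z = NormalDist().inv_cdf(auc)         # probit of the AUC
    d = z * math.sqrt(2)                  # Cohen's d under equal-variance normal groups
    ln_or = d * math.pi / math.sqrt(3)    # normal-to-logistic scaling
    return math.exp(ln_or)

print(f"AUC 0.5 -> OR {auc_to_or(0.5):.3f}")  # 1.000: no discrimination, even odds
print(f"AUC 0.7 -> OR {auc_to_or(0.7):.3f}")
```

The mapping is monotonic: an AUC of 0.5 corresponds to an odds ratio of exactly 1, and larger AUCs give larger odds ratios.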
## References
### DeLong's Test
The implementation of `Delong_test` in MLstatkit is based on the following publication:
- Xu Sun and Weichao Xu, "Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves," in *IEEE Signal Processing Letters*, vol. 21, no. 11, pp. 1389-1393, 2014, IEEE.
### Bootstrapping
The `Bootstrapping` method for calculating confidence intervals does not directly reference a single publication but is a widely accepted statistical technique for estimating the distribution of a metric by resampling with replacement. For a comprehensive overview of bootstrapping methods, see:
- B. Efron and R. Tibshirani, "An Introduction to the Bootstrap," Chapman & Hall/CRC Monographs on Statistics & Applied Probability, 1994.
### Permutation Test
The `Permutation_test` function assesses the significance of the difference in performance metrics between two models by randomly reallocating predictions between the two groups and recomputing the metric. This approach makes no specific distributional assumptions, making it versatile across data types. For a foundational discussion of permutation tests, refer to:
- P. Good, "Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses," Springer Series in Statistics, 2000.
These references lay the groundwork for the statistical tests and methodologies implemented in MLstatkit, providing users with a deep understanding of their scientific basis and applicability.
### AUC2OR
The `AUC2OR` function converts the Area Under the Receiver Operating Characteristic Curve (AUC) into several related statistics, including Cohen’s d, Pearson’s rpb, the odds ratio, and the natural log odds ratio. This conversion is particularly useful for interpreting the performance of classification models. For a detailed explanation of the mathematical formulas used, refer to:
- Salgado, J. F. (2018). "Transforming the area under the normal curve (AUC) into Cohen’s d, Pearson’s rpb, odds-ratio, and natural log odds-ratio: Two conversion tables." European Journal of Psychology Applied to Legal Context, 10(1), 35-47.
This reference provides the mathematical foundation for the `AUC2OR` function, ensuring that users can accurately interpret the statistical significance and practical implications of their model performance metrics.
## Contributing
We welcome contributions to MLstatkit! Please see our contribution guidelines for more details.
## License
MLstatkit is distributed under the MIT License. For more information, see the LICENSE file in the GitHub repository.
### Update log
- `0.1.7` Update `README.md`
- `0.1.6` Debug.
- `0.1.5` Update `README.md`, Add `AUC2OR` function.
- `0.1.4` Update `README.md`, Add `Permutation_test` function, Re-do `Bootstrapping` Parameters.
- `0.1.3` Update `README.md`.
- `0.1.2` Add `Bootstrapping` operation process progress display.
- `0.1.1` Update `README.md`, `setup.py`. Add `CONTRIBUTING.md`.
- `0.1.0` First edition