# MED3pa Package
## Table of Contents
- [Overview](#overview)
- [Key Functionalities](#key-functionalities)
- [Subpackages](#subpackages)
- [Getting Started with the Package](#getting-started-with-the-package)
  - [Installation](#installation)
  - [A Simple Example](#a-simple-example)
  - [Tutorials](#tutorials)
- [Acknowledgement](#acknowledgement)
- [References](#references)
- [Authors](#authors)
- [Statement](#statement)
- [Supported Python Versions](#supported-python-versions)
## Overview
<img src="https://github.com/lyna1404/MED3pa/blob/main/docs/diagrams/package_white_bg.svg" alt="Overview" style="width:100%;">
The **MED3pa** package addresses critical challenges in deploying machine learning models, focusing on their robustness and reliability under real-world conditions. It provides tools for evaluating model stability and performance in the face of **covariate shifts**, **uncertainty**, and **problematic data profiles**.
## Key Functionalities
- **Covariate Shift Detection**: Using the `detectron` subpackage, MED3pa identifies significant shifts in data distributions that might affect the model’s predictions. This is crucial in applications such as healthcare, where early detection of shifts can prevent erroneous decisions.
- **Uncertainty and Confidence Estimation**: Through the med3pa subpackage, the package measures the uncertainty and predictive confidence at both individual and group levels. This helps in understanding the reliability of model predictions and in making informed decisions based on model outputs.
- **Identification of Problematic Profiles**: MED3pa analyzes data profiles that consistently lead to poor model performance. This capability allows developers to refine training datasets or retrain models to handle these edge cases effectively.
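To make the idea of covariate shift concrete, the sketch below compares one feature's distribution in a reference sample and in a test sample using the classical two-sample Kolmogorov-Smirnov statistic. This is a generic univariate check written with the standard library only. It is not the Detectron method used by MED3pa, and the function name `ks_statistic` and the synthetic data are assumptions made for illustration.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, test):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples (0.0 = identical distributions, 1.0 = fully separated).
    """
    ref = sorted(reference)
    tst = sorted(test)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in ref + tst:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_tst = bisect_right(tst, x) / len(tst)
        gap = max(gap, abs(cdf_ref - cdf_tst))
    return gap


# Synthetic one-dimensional feature (illustrative data, not from MED3pa).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
same = [rng.gauss(0.0, 1.0) for _ in range(500)]      # same distribution
shifted = [rng.gauss(0.6, 1.0) for _ in range(500)]   # mean moved by 0.6

print("no shift:", ks_statistic(reference, same))
print("shifted: ", ks_statistic(reference, shifted))
```

A statistic like this only flags that a feature's distribution moved. It says nothing about whether the shift harms the model, which is the question the `detectron` subpackage is meant to address.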
## Subpackages
<p align="center">
<img src="https://github.com/lyna1404/MED3pa/blob/main/docs/diagrams/subpackages.svg" alt="Subpackages">
</p>
The package is structured into four distinct subpackages:
- **datasets**: Stores and manages the datasets.
- **models**: Handles ML model operations.
- **detectron**: Evaluates the model against covariate shift.
- **med3pa**: Evaluates the model’s performance & extracts problematic profiles.
This modularity allows users to easily integrate and utilize specific functionalities tailored to their needs without dealing with unnecessary complexities.
## Getting Started with the Package
To get started with MED3pa, follow the installation instructions and usage examples provided in the documentation.
### Installation
```bash
pip install MED3pa
```
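After installing, one way to confirm that the distribution is visible to the current interpreter is to query its metadata. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if MED3pa is installed, otherwise None.
print(installed_version("MED3pa"))
```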
### A Simple Example
```python
from MED3pa.datasets import DatasetsManager
from MED3pa.models import BaseModelManager, ModelFactory
from MED3pa.med3pa import Med3paDetectronExperiment
# Initialize the DatasetsManager
datasets = DatasetsManager()
# Load datasets for training, validation, reference, and testing
datasets.set_from_file(dataset_type="training", file='./tutorials/data/train_data.csv', target_column_name='Outcome')
datasets.set_from_file(dataset_type="validation", file='./tutorials/data/val_data.csv', target_column_name='Outcome')
datasets.set_from_file(dataset_type="reference", file='./tutorials/data/test_data.csv', target_column_name='Outcome')
datasets.set_from_file(dataset_type="testing", file='./tutorials/data/test_data_shifted_0.6.csv', target_column_name='Outcome')
# Initialize the model factory and load the pre-trained model
factory = ModelFactory()
model = factory.create_model_from_pickled("./tutorials/models/diabetes_xgb_model.pkl")
# Set the base model using BaseModelManager
base_model_manager = BaseModelManager()
base_model_manager.set_base_model(model=model)
# Execute the integrated MED3pa and Detectron experiment
reference_results, test_results, detectron_results = Med3paDetectronExperiment.run(
    datasets=datasets,
    base_model_manager=base_model_manager,
)
# Save the results to a specified directory
# (use the same variable names that were returned by run() above)
reference_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/reference')
test_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/test')
detectron_results.save(file_path='./tutorials/med3pa_detectron_experiment_results/detectron')
```
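The example passes `target_column_name='Outcome'` for every file, so each CSV is expected to contain that column. As a small pre-flight check before loading, the sketch below reads only the header row of a CSV file. It is not part of MED3pa: the helper name `has_column` is hypothetical, and the demo file and its feature columns are made up for illustration.

```python
import csv
import tempfile
from pathlib import Path


def has_column(csv_path, column_name):
    """Return True if the CSV header row contains column_name."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return column_name in header


# Demo on a throwaway file (column names other than 'Outcome' are invented).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.csv"
    demo_file.write_text("feature_a,feature_b,Outcome\n1.0,2.0,1\n")
    print(has_column(demo_file, "Outcome"))
    print(has_column(demo_file, "Label"))
```

Running the same check on each of the four files before calling `set_from_file` can surface a misspelled target column early.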
### Tutorials
We have created many [tutorial notebooks](https://github.com/lyna1404/MED3pa/tree/main/tutorials) to assist you in learning how to use the different parts of the package.
## Acknowledgement
MED3pa is an open-source package developed at the [MEDomics-Udes](https://www.medomics-udes.org/en/) laboratory with the collaboration of the international consortium [MEDomics](https://www.medomics.ai/). We welcome any contribution and feedback.
## References
This package utilizes the methods described in the following work:
Ginsberg, T., Liang, Z., & Krishnan, R. G. (2023). [A Learning Based Hypothesis Test for Harmful Covariate Shift](https://openreview.net/forum?id=rdfgqiwz7lZ). In *The Eleventh International Conference on Learning Representations*.
## Authors
* [Lyna Chikouche](https://www.linkedin.com/in/lynahiba-chikouche-62a5181bb/): Research intern at the MEDomics-Udes laboratory.
* [Ludmila Amriou](https://www.linkedin.com/in/ludmila-amriou-875b58238//): Research intern at the MEDomics-Udes laboratory.
* [Olivier Lefebvre](https://www.linkedin.com/in/olivier-lefebvre-bb8837162/): Ph.D. student in computer science at Université de Sherbrooke.
* [Martin Vallières](https://www.linkedin.com/in/martvallieres/): Assistant professor, computer science department, Université de Sherbrooke.
## Statement
This package is part of https://github.com/medomics, a package providing research utility tools for developing precision medicine applications.
```
Copyright (C) 2024 MEDomics consortium
GPL3 LICENSE SYNOPSIS
Here's what the license entails:
1. Anyone can copy, modify and distribute this software.
2. You have to include the license and copyright notice with each and every distribution.
3. You can use this software privately.
4. You can use this software for commercial purposes.
5. If you dare build your business solely from this code, you risk open-sourcing the whole code base.
6. If you modify it, you have to indicate changes made to the code.
7. Any modifications of this code base MUST be distributed with the same license, GPLv3.
8. This software is provided without warranty.
9. The software author or license can not be held liable for any damages inflicted by the software.
```
More information about the [LICENSE can be found here](https://github.com/MEDomics-UdeS/MEDimage/blob/main/LICENSE.md).
## Supported Python Versions
The **MED3pa** package is developed and tested with Python 3.12.3.
Additionally, it is compatible with the following Python versions:
- Python 3.11.x
- Python 3.10.x
- Python 3.9.x
While the package may work with other versions of Python, these are the versions we officially support and recommend.
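A script can guard against unsupported interpreters before importing the package. The sketch below encodes the versions listed above. The names `SUPPORTED_MINORS` and `is_supported` are hypothetical and not part of MED3pa.

```python
import sys

# Minor versions listed as developed/tested (3.12) or compatible (3.9 to 3.11).
SUPPORTED_MINORS = {(3, 9), (3, 10), (3, 11), (3, 12)}


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) pair is in the supported set."""
    return tuple(version_info[:2]) in SUPPORTED_MINORS


if not is_supported():
    print("Warning: this Python version is outside the officially supported range.")
```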
Raw data
{
"_id": null,
"home_page": "https://github.com/lyna1404/MED3pa",
"name": "MED3pa",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": "MEDomics consortium",
"author_email": "medomics.info@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/b1/70/51d5739339c51fafa1c938590bb89e297eabe135ee478fd72e066926695f/med3pa-0.1.37.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python Open-source package for ensuring robust and reliable ML models deployments",
"version": "0.1.37",
"project_urls": {
"Documentation": "https://med3pa.readthedocs.io/en/latest/",
"Github": "https://github.com/lyna1404/MED3pa",
"Homepage": "https://github.com/lyna1404/MED3pa"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "90cf6fe686216f8fc5aca90c64d1b55542b17c0f383cd773454dc0b448280cdf",
"md5": "06f4bc59363205b184a8e6f8c14797f4",
"sha256": "7bb115b3c9fc5daf8622cc484c586af021d2f3c7fa322eafa04618caa2dd955c"
},
"downloads": -1,
"filename": "MED3pa-0.1.37-py3-none-any.whl",
"has_sig": false,
"md5_digest": "06f4bc59363205b184a8e6f8c14797f4",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 97981,
"upload_time": "2024-09-10T15:38:42",
"upload_time_iso_8601": "2024-09-10T15:38:42.124285Z",
"url": "https://files.pythonhosted.org/packages/90/cf/6fe686216f8fc5aca90c64d1b55542b17c0f383cd773454dc0b448280cdf/MED3pa-0.1.37-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b17051d5739339c51fafa1c938590bb89e297eabe135ee478fd72e066926695f",
"md5": "dd68158f876afd6f23aa7a15c72004f8",
"sha256": "6dde08329c5a048967470dca5f9b4fa7222395bde2545835050266e0bde48c50"
},
"downloads": -1,
"filename": "med3pa-0.1.37.tar.gz",
"has_sig": false,
"md5_digest": "dd68158f876afd6f23aa7a15c72004f8",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 4171655,
"upload_time": "2024-09-10T15:38:43",
"upload_time_iso_8601": "2024-09-10T15:38:43.293284Z",
"url": "https://files.pythonhosted.org/packages/b1/70/51d5739339c51fafa1c938590bb89e297eabe135ee478fd72e066926695f/med3pa-0.1.37.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-10 15:38:43",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lyna1404",
"github_project": "MED3pa",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "numpy",
"specs": [
[
">=",
"1.21.0"
],
[
"<",
"2.1.0"
]
]
},
{
"name": "pandas",
"specs": [
[
">=",
"1.4.0"
],
[
"<",
"3.0.0"
]
]
},
{
"name": "PyYAML",
"specs": [
[
"<",
"7.0"
],
[
">=",
"5.4"
]
]
},
{
"name": "scikit_learn",
"specs": [
[
">=",
"1.0"
],
[
"<",
"2.0"
]
]
},
{
"name": "scipy",
"specs": [
[
"<",
"2.0"
],
[
">=",
"1.7"
]
]
},
{
"name": "setuptools",
"specs": [
[
">=",
"49.6.0"
],
[
"<",
"70.0"
]
]
},
{
"name": "torch",
"specs": [
[
">=",
"1.9.0"
],
[
"<",
"3.0"
]
]
},
{
"name": "tqdm",
"specs": [
[
">=",
"4.50.0"
],
[
"<",
"5.0"
]
]
},
{
"name": "xgboost",
"specs": [
[
"==",
"2.0.3"
]
]
}
],
"lcname": "med3pa"
}