PyEDCR

Name: PyEDCR
Version: 1.1.3
Summary: PyEDCR is a metacognitive neuro-symbolic method for learning error detection and correction rules in deployed ML models using combinatorial sub-modular set optimization
Upload time: 2025-02-19 06:34:54
Requires Python: >=3.9
Keywords: artificial intelligence, data analysis, data mining, data science, error correction, error detection, machine learning, metacognition, neurosymbolic AI, rule learning
            <p align="center">
    <img alt="Logo" src="https://github.com/lab-v2/PyEDCR/raw/master/images/logo-transparent-png.png" width="440" height="500"/>
</p>


[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) [![Upload Python Package](https://github.com/lab-v2/PyEDCR/actions/workflows/python-publish.yml/badge.svg)](https://github.com/lab-v2/PyEDCR/actions/workflows/python-publish.yml) [![GitHub Pages Deployment](https://github.com/lab-v2/PyEDCR/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/lab-v2/PyEDCR/actions/workflows/pages/pages-build-deployment)

- [What is this?](#what-is-this)
- [Installation](#installation)
- [Example](#example)
- [Authors](#authors)
- [Acknowledgments](#acknowledgments)

# What is this?

[`PyEDCR`](https://github.com/lab-v2/metacognitive_error_detection_and_correction_v2/tree/master) is a Python implementation of the f-EDR (Focused Error Detection Rules) paradigm. The goal of EDR is to use a set of conditions to learn when a machine learning model makes an incorrect prediction. 
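Conceptually, a detection rule fires on a sample when any condition in a learned condition set holds. The following is a minimal sketch of that idea, illustrative only and not the package's internal API:

```python
from typing import Callable, Iterable

# A condition inspects one sample's prediction metadata and returns True/False.
Condition = Callable[[dict], bool]

def detects_error(sample: dict, conditions: Iterable[Condition]) -> bool:
    """An error-detection rule fires if any learned condition holds."""
    return any(cond(sample) for cond in conditions)

# Illustrative condition: the secondary model disagrees with the main model.
disagree = lambda s: s["main_pred"] != s["secondary_pred"]
print(detects_error({"main_pred": "tank", "secondary_pred": "truck"}, [disagree]))  # True
```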

The EDCR method was first introduced in '_Rule-Based Error Detection and Correction to Operationalize Movement Trajectory Classification_' ([ArXiv preprint](https://arxiv.org/abs/2308.14250)) and later extended to the f-EDR method in the conference article '_Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge_' presented at CIKM 2024 ([ACM Publication](https://dl.acm.org/doi/10.1145/3627673.3679918), [ArXiv Preprint](https://arxiv.org/abs/2407.15192)).

The package is tested with Python >= 3.9.

# Installation

To install the package, run:

```bash
pip install PyEDCR
```
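After installing, a quick sanity check confirms the package is importable (the module paths below are the ones used in the example that follows):

```python
# Sanity check: these imports should succeed after installation.
from PyEDCR.classes import experiment_config
from PyEDCR.PyEDCR import run_experiment

print(run_experiment.__module__)  # PyEDCR.PyEDCR
```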


# Example

To demonstrate the use of the package, we consider a dataset with two levels of hierarchy, such that each image has both a fine-grain and a coarse-grain label. For example, consider the following sample from our curated Military Vehicles dataset (which can be found [here](https://huggingface.co/datasets/lab-v2/military_vehicles)):

<p align="center">
    <img alt="Military vehicles hierarchy example" src="https://github.com/lab-v2/PyEDCR/raw/master/images/spa_tank_diagram.PNG" width="600" height="250"/>
</p>
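For concreteness, a two-level hierarchy like this can be represented as a mapping from each fine-grain class to its unique coarse-grain parent. Here is a minimal sketch with hypothetical class names, not PyEDCR's internal representation:

```python
# Hypothetical fine-to-coarse mapping in the spirit of the military
# vehicles example; every fine-grain class has exactly one coarse parent.
FINE_TO_COARSE = {
    "2S19_MSTA": "Self Propelled Artillery",
    "T-62": "Tank",
    "T-64": "Tank",
}

def coarse_label(fine_label: str) -> str:
    """Return the coarse-grain parent of a fine-grain label."""
    return FINE_TO_COARSE[fine_label]

print(coarse_label("T-62"))  # Tank
```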

We further consider a pretrained 'main' model, for example one that employs the small version of [Meta's _DINO_V2_ architecture](https://dinov2.metademolab.com/) and was fine-tuned on ImageNet50, a 50-class subset of the [ImageNet1K dataset](https://www.image-net.org/index.php) (which can be found [here](https://huggingface.co/datasets/lab-v2/ImageNet50)), whose ability to classify both levels of the hierarchy we want to analyze. An instance of such a model (which can be found [here](https://huggingface.co/lab-v2/dinov2_vits14_imagenet_lr1e-06_BCE)) has the following performance:

Fine-grain prior combined accuracy: <code style="color:green">76.57%</code>, fine-grain prior combined macro f1: <code style="color:green">76.1%</code>\
Fine-grain prior combined macro precision: <code style="color:green">76.96%</code>, fine-grain prior combined macro recall: <code style="color:green">76.57%</code>

Coarse-grain prior combined accuracy: <code style="color:green">87.14%</code>, coarse-grain prior combined macro f1: <code style="color:green">85.77%</code>\
Coarse-grain prior combined macro precision: <code style="color:green">87.36%</code>, coarse-grain prior combined macro recall: <code style="color:green">84.64%</code>

Total prior inconsistencies: <code style="color:red">133/2100</code> (<code style="color:red">6.33%</code>)
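A prediction pair is counted as inconsistent when the predicted coarse-grain label is not the parent of the predicted fine-grain label. A minimal sketch of that count, with hypothetical label names rather than PyEDCR's internals:

```python
# Illustrative computation of the prior inconsistency rate reported above.
def inconsistency_rate(fine_preds, coarse_preds, fine_to_coarse):
    """Count prediction pairs whose coarse label is not the fine label's parent."""
    bad = sum(fine_to_coarse[f] != c for f, c in zip(fine_preds, coarse_preds))
    return bad, 100 * bad / len(fine_preds)

fine_to_coarse = {"T-64": "Tank", "BMP-1": "BMP"}
count, pct = inconsistency_rate(["T-64", "BMP-1"], ["Tank", "Tank"], fine_to_coarse)
print(f"{count}/2 ({pct:.2f}%)")  # 1/2 (50.00%)
```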

We also consider a 'secondary' model (which can be found [here](https://huggingface.co/lab-v2/dinov2_vitl14_imagenet_lr1e-06_BCE)), which employs the large version of the DINO_V2 architecture and was also fine-tuned on the ImageNet50 dataset, along with binary models trained on each class of the dataset.
The following code snippet runs the `run_experiment` function from `PyEDCR.py`:

```python
from PyEDCR.classes import experiment_config
from PyEDCR.PyEDCR import run_experiment

# Configure the experiment: the main, secondary, and per-class binary
# models, their learning rates, and the number of training epochs.
imagenet_config = experiment_config.ExperimentConfig(
    data_str='imagenet',
    main_model_name='dinov2_vits14',
    secondary_model_name='dinov2_vitl14',
    main_lr=1e-6,
    secondary_lr=1e-6,
    binary_lr=1e-6,
    original_num_epochs=8,
    secondary_num_epochs=2,
    binary_num_epochs=5
)

run_experiment(config=imagenet_config)
```

The code initiates the rule-learning pipeline, uses the learned rules to mark errors in the main model's predictions, and, after running the f-EDR algorithm, prints the algorithm's performance metrics on the error class, which in this case are:

```
error_accuracy: 89.0
error_balanced_accuracy: 84.23
error_precision: 81.65
error_recall: 74.31
error_f1: 77.81
recovered_constraints_precision: 100.0
recovered_constraints_recall: 59.36
recovered_constraints_f1_score: 74.5
```
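For intuition, the error-class metrics can be read as standard binary-classification metrics where "the main model erred on this sample" is the positive class. A hedged sketch with made-up flag arrays, not PyEDCR's internals:

```python
# Illustrative computation of the error-class metrics reported above.
from sklearn.metrics import (accuracy_score, balanced_accuracy_score,
                             f1_score, precision_score, recall_score)

true_errors = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = the main model actually erred
flagged     = [1, 0, 1, 0, 0, 1, 1, 0]  # 1 = the learned rules flagged an error

for name, fn in [("error_accuracy", accuracy_score),
                 ("error_balanced_accuracy", balanced_accuracy_score),
                 ("error_precision", precision_score),
                 ("error_recall", recall_score),
                 ("error_f1", f1_score)]:
    print(f"{name}: {100 * fn(true_errors, flagged):.2f}")
```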

For further details about the rule-learning algorithm and the noise-tolerance experiments, please refer to the [paper](https://arxiv.org/abs/2407.15192).

# Authors

If you use this work, please cite our paper:
```
@inproceedings{10.1145/3627673.3679918,
author = {Kricheli, Joshua Shay and Vo, Khoa and Datta, Aniruddha and Ozgur, Spencer and Shakarian, Paulo},
title = {Error Detection and Constraint Recovery in Hierarchical Multi-Label Classification without Prior Knowledge},
year = {2024},
isbn = {9798400704369},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3627673.3679918},
doi = {10.1145/3627673.3679918},
abstract = {Recent advances in Hierarchical Multi-label Classification (HMC), particularly neurosymbolic-based approaches, have demonstrated improved consistency and accuracy by enforcing constraints on a neural model during training. However, such work assumes the existence of such constraints a-priori. In this paper, we relax this strong assumption and present an approach based on Error Detection Rules (EDR) that allow for learning explainable rules about the failure modes of machine learning models. We show that these rules are not only effective in detecting when a machine learning classifier has made an error but also can be leveraged as constraints for HMC, thereby allowing the recovery of explainable constraints even if they are not provided. We show that our approach is effective in detecting machine learning errors and recovering constraints, is noise tolerant, and can function as a source of knowledge for neurosymbolic models on multiple datasets, including a newly introduced military vehicle recognition dataset.},
booktitle = {Proceedings of the 33rd ACM International Conference on Information and Knowledge Management},
pages = {3842–3846},
numpages = {5},
keywords = {hierarchical multi-label classification, learning with constraints, metacognitive ai, neurosymbolic ai, rule learning},
location = {Boise, ID, USA},
series = {CIKM '24}
}
```

# Acknowledgments

This research was funded by ARO grant W911NF-24-1-0007.


<p align="center">
<a href="https://scai.engineering.asu.edu/">
<img src="https://cdn.shopify.com/s/files/1/1095/6418/files/ASU-sun-devils-new-logo.jpg?v=1481918145" height="150"  alt=""/>
</a>
&emsp;
<a href="https://arl.devcom.army.mil/who-we-are/aro/">
<img src="https://upload.wikimedia.org/wikipedia/commons/3/3e/ARO_Logo.png" height="150"  alt=""/>
</a>
&emsp;
&emsp;
<a href="https://labs.engineering.asu.edu/labv2/">
<img src="https://labs.engineering.asu.edu/labv2/wp-content/uploads/sites/82/2022/03/logo_LabV2-1-1000x510.png" height="150"  alt=""/>
</a>
</p>
            
