edgaro


Name: edgaro
Version: 1.0.2.1
Home page: https://github.com/adrianstando/edgaro
Summary: Explainable imbalanceD learninG compARatOr
Upload time: 2023-06-16 20:12:55
Author: Adrian Stańdo
Requires Python: >=3.8, <4
Keywords: xai, imbalance, machine learning, ai
Requirements: No requirements were recorded.

# Explainable imbalanceD learninG compARatOr

[![main-check](https://github.com/adrianstando/EDGAR/actions/workflows/main-check.yaml/badge.svg)](https://github.com/adrianstando/EDGAR/actions/workflows/main-check.yaml)

## Overview

Balancing methods such as Random Undersampling, Random Oversampling, SMOTE and NearMiss are a very popular 
solution when dealing with imbalanced data. However, this raises the question of whether these techniques change the
model behaviour or the relationships present in the data. 
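
As an illustration of what the simplest of these methods does, here is a minimal NumPy sketch of Random Undersampling. It is not the `edgaro` or imbalanced-learn implementation; the function name `random_undersample` and the toy data are assumptions made for this example.

```python
import numpy as np


def random_undersample(X, y, seed=0):
    """Randomly drop rows of larger classes until every class matches the smallest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    # Pick n_min row indices per class, without replacement.
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # keep the original row order
    return X[keep], y[keep]


# Toy data: 90 majority rows (class 0) and 10 minority rows (class 1).
X = np.arange(200, dtype=float).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = random_undersample(X, y)
print(np.bincount(y_bal))  # class counts after balancing
```

After balancing, both classes have as many rows as the original minority class, so a model trained on `X_bal` sees a different data distribution than one trained on `X`.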

As there are many kinds of Machine Learning models, this package provides model-agnostic tools to investigate the model
behaviour and its changes. These tools are also known as Explainable Artificial Intelligence (XAI) tools and include
techniques such as Partial Dependence Profile (PDP), Accumulated Local Effects (ALE) and Variable Importance (VI). 
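
To make the idea of a Partial Dependence Profile concrete, the sketch below computes one by hand with NumPy: for each grid value, the chosen feature is overwritten in every row and the predictions are averaged. This is only an illustration under stated assumptions; `partial_dependence_profile` and `toy_predict` are hypothetical names, not part of `edgaro` or dalex.

```python
import numpy as np


def partial_dependence_profile(predict, X, feature, grid):
    """Average prediction over the data with one feature forced to each grid value."""
    profile = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature in every row
        profile.append(predict(X_mod).mean())
    return np.array(profile)


def toy_predict(X):
    """Hypothetical stand-in for a fitted model's prediction function."""
    return 2.0 * X[:, 0] + X[:, 1]


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
grid = np.linspace(-1.0, 1.0, 5)

pdp = partial_dependence_profile(toy_predict, X, feature=0, grid=grid)
print(pdp)
```

Because the procedure only needs a prediction function, it works for any model type, which is what "model-agnostic" means here.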

Apart from that, the package implements novel methods to compare the explanations, which are *Standard Deviation of 
Distances* (for PDP and ALE) and the Wilcoxon statistical test (for VI).
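
The following sketch shows one plausible reading of these two comparisons. It assumes that *Standard Deviation of Distances* means the standard deviation of the pointwise differences between two profiles evaluated on the same grid, and it uses the signed-rank test from `scipy.stats.wilcoxon` for paired VI values. Both choices are assumptions made for illustration, the numbers are made up, and the package's own definitions may differ.

```python
import numpy as np
from scipy.stats import wilcoxon


def std_of_distances(profile_a, profile_b):
    """Standard deviation of pointwise differences between two curves on a shared grid."""
    diffs = np.asarray(profile_a) - np.asarray(profile_b)
    return float(np.std(diffs))


grid = np.linspace(0.0, 1.0, 11)
pdp_original = grid ** 2
pdp_shifted = grid ** 2 + 0.3  # same shape, constant vertical offset
pdp_changed = np.sqrt(grid)    # different shape

sdd_shifted = std_of_distances(pdp_original, pdp_shifted)
sdd_changed = std_of_distances(pdp_original, pdp_changed)
print(sdd_shifted, sdd_changed)

# Paired variable-importance values for the same variables (made-up numbers).
vi_original = np.array([0.30, 0.22, 0.18, 0.12, 0.09, 0.05, 0.03, 0.01])
vi_balanced = np.array([0.25, 0.24, 0.11, 0.16, 0.10, 0.02, 0.09, 0.09])
result = wilcoxon(vi_original, vi_balanced)
print(result.pvalue)
```

Under this reading, a constant vertical shift between two profiles gives a value close to zero, while a change in the shape of the curve gives a larger value.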

Generally speaking, this package aims to give a user-friendly interface to investigate whether the described 
phenomena take place.

The package was written in Python and consists of four modules: *dataset*, *balancing*, *model* and *explain*. 
It provides a simple and user-friendly interface which aims to automate the process of data balancing with different 
methods, training Machine Learning models and calculating PDP/ALE/VI explanations. The package can be used for one input
dataset or for a number of datasets arranged in arrays or nested arrays.

## Technologies

The package was written in Python and was checked to be compatible with Python 3.8, Python 3.9 and Python 3.10.

It uses the most popular libraries for Machine Learning in Python:

* pandas, NumPy
* scikit-learn, xgboost
* imbalanced-learn
* dalex
* scipy, statsmodels
* matplotlib
* openml

## User Manual

The User Manual is available as part of the documentation, [here](https://adrianstando.github.io/edgaro/source/manual.html).

## Installation

The `edgaro` package is available on [PyPI](https://pypi.org/project/edgaro/) and can be installed with:

```console
pip install edgaro
```
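
To confirm that the installation succeeded, a short check with the standard library can report the installed version. This is a generic sketch, not something the package provides; the helper name `installed_version` is an assumption.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python:", ".".join(str(part) for part in sys.version_info[:3]))
print("edgaro:", installed_version("edgaro"))
```

If the second line prints `None`, the package is not installed in the interpreter that ran the check.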

## Documentation

The documentation is available at [adrianstando.github.io/edgaro](https://adrianstando.github.io/edgaro).

## Project purpose

This package was created for the purpose of my Engineering Thesis *"The impact of data balancing on model behaviour with 
Explainable Artificial Intelligence tools in imbalanced classification problems"*.

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/adrianstando/edgaro",
    "name": "edgaro",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8, <4",
    "maintainer_email": "",
    "keywords": "XAI,imbalance,machine learning,AI",
    "author": "Adrian Sta\u0144do",
    "author_email": "adrian.j.stando@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/09/f4/ce4db16126edd931b8148f5c8e3b11db52dbb0995ada053cb14f4815e91e/edgaro-1.0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Explainable imbalanceD learninG compARatOr",
    "version": "1.0.2.1",
    "project_urls": {
        "Code repository": "https://github.com/adrianstando/edgaro",
        "Documentation": "https://adrianstando.github.io/edgaro",
        "Homepage": "https://github.com/adrianstando/edgaro"
    },
    "split_keywords": [
        "xai",
        "imbalance",
        "machine learning",
        "ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "023cc07d1d80dd5b53e3161e8f61b8e4767812282d19161a0958b88168853ed2",
                "md5": "55414237f22ec3ba0d79ca6d8aecf70f",
                "sha256": "ce93110987f93c0cce33f1de91a5b3a44bcd3c545de9b0eb42c6349a367c1a4e"
            },
            "downloads": -1,
            "filename": "edgaro-1.0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "55414237f22ec3ba0d79ca6d8aecf70f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8, <4",
            "size": 60831,
            "upload_time": "2023-06-16T20:12:53",
            "upload_time_iso_8601": "2023-06-16T20:12:53.028377Z",
            "url": "https://files.pythonhosted.org/packages/02/3c/c07d1d80dd5b53e3161e8f61b8e4767812282d19161a0958b88168853ed2/edgaro-1.0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "09f4ce4db16126edd931b8148f5c8e3b11db52dbb0995ada053cb14f4815e91e",
                "md5": "292e374d75376f84de0ed8e8444a8e8c",
                "sha256": "4ef296924b34eb06ccea89bf606571f6f7d5b8d692432b7416b563260d54bb6f"
            },
            "downloads": -1,
            "filename": "edgaro-1.0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "292e374d75376f84de0ed8e8444a8e8c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8, <4",
            "size": 61613,
            "upload_time": "2023-06-16T20:12:55",
            "upload_time_iso_8601": "2023-06-16T20:12:55.707763Z",
            "url": "https://files.pythonhosted.org/packages/09/f4/ce4db16126edd931b8148f5c8e3b11db52dbb0995ada053cb14f4815e91e/edgaro-1.0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-16 20:12:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "adrianstando",
    "github_project": "edgaro",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "tox": true,
    "lcname": "edgaro"
}
        