bhad


| Field | Value |
|---|---|
| Name | bhad |
| Version | 0.1.0 |
| home_page | https://github.com/AVoss84/bhad |
| Summary | Bayesian Histogram-based Anomaly Detection |
| upload_time | 2023-05-30 09:22:53 |
| maintainer | Alexander Vosseler |
| docs_url | None |
| author | Alexander Vosseler |
| requires_python | >=3.8 |
| keywords | bayesian-inference, anomaly-detection, unsupervised-learning, explainability |
| requirements | No requirements were recorded. |

# Bayesian Histogram-based Anomaly Detection (BHAD)

Python implementation of the BHAD algorithm as presented in [Vosseler, A. (2023): BHAD: Explainable anomaly detection using Bayesian histograms](https://www.researchgate.net/publication/364265660_BHAD_Fast_unsupervised_anomaly_detection_using_Bayesian_histograms). The ***bhad* package** follows scikit-learn's standard API for [outlier detection](https://scikit-learn.org/stable/modules/outlier_detection.html).
<!--- The *bhad* package has been presented on *PyCon DE & PyData Berlin 2023*, you can watch the presentation [here](https://vimeo.com/user/171811262/folder/15825490). --> 

## Installation

```bash
pip install bhad
```

## Usage

1.) Preprocess the input data: discretize continuous features and, optionally, conduct Bayesian model selection.

2.) Train the model using discrete data.
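
To make the two steps more concrete, here is a rough, standard-library-only sketch of the general idea behind histogram-based scoring. It is not the BHAD algorithm or the package's code: it assumes simple equal-width bins and additive smoothing, and the names `discretize`, `histogram_scores`, and `alpha`, as well as the toy data, are hypothetical.

```python
import math
import random
from collections import Counter


def discretize(values, nbins):
    """Map each value to an equal-width bin index in [0, nbins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0   # guard against a constant feature
    return [min(int((v - lo) / width), nbins - 1) for v in values]


def histogram_scores(columns, alpha=1.0):
    """Sum smoothed log relative frequencies over all features, per row.

    `columns` is a list of discrete columns of equal length; `alpha` is a
    pseudo-count added to every observed category (illustrative only).
    """
    n = len(columns[0])
    scores = [0.0] * n
    for col in columns:
        counts = Counter(col)
        denom = n + alpha * len(counts)
        for i, category in enumerate(col):
            scores[i] += math.log((counts[category] + alpha) / denom)
    return scores


random.seed(0)
amount = [random.gauss(100.0, 10.0) for _ in range(199)] + [500.0]  # one extreme value
channel = ["web"] * 150 + ["app"] * 49 + ["fax"]                    # one rare category

# Step 1: discretize the continuous feature; the categorical one is already discrete.
amount_binned = discretize(amount, nbins=10)

# Step 2: score each row from the per-feature histograms (lower = more unusual).
scores = histogram_scores([amount_binned, channel])
most_anomalous = min(range(len(scores)), key=scores.__getitem__)
print(most_anomalous)
```

In this toy setup the last row combines an extreme amount with a rare channel, so it receives the lowest score. The package itself may choose the bins and weight the evidence quite differently.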

For convenience, these two steps can optionally be wrapped in a scikit-learn pipeline:

```python
from bhad.model import BHAD
from bhad.utils import Discretize
from sklearn.pipeline import Pipeline

num_cols = [....]   # names of numeric features
cat_cols = [....]   # categorical features

pipe = Pipeline(steps=[
   ('discrete', Discretize(nbins = None)),   
   ('model', BHAD(contamination = 0.01, num_features = num_cols, cat_features = cat_cols))
])
```

For a given dataset, get binary model decisions:

```python
y_pred = pipe.fit_predict(X = dataset)        
```
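
In scikit-learn's outlier detection API, predictions are `-1` for outliers and `1` for inliers, and `contamination` is the expected share of outliers. Assuming *bhad* follows that convention, the sketch below shows how a contamination rate could turn scores into binary decisions. The helper `decisions_from_scores` and the random stand-in scores are hypothetical, and the way BHAD actually derives its threshold may differ.

```python
import random


def decisions_from_scores(scores, contamination):
    """Label the lowest-scoring fraction of rows as -1 (anomaly), the rest as +1."""
    n_flag = max(1, round(contamination * len(scores)))
    threshold = sorted(scores)[n_flag - 1]
    return [-1 if s <= threshold else 1 for s in scores]


random.seed(1)
scores = [random.random() for _ in range(1000)]   # stand-in anomaly scores
y_pred = decisions_from_scores(scores, contamination=0.01)

# Indices of the rows flagged as anomalous
flagged = [i for i, label in enumerate(y_pred) if label == -1]
print(len(flagged))
```

With `contamination = 0.01` and 1000 rows, about 1% of the observations (here 10) end up flagged.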

Get a global model explanation as well as explanations for individual observations:

```python
from bhad.explainer import Explainer

local_expl = Explainer(pipe.named_steps['model'], pipe.named_steps['discrete']).fit()

local_expl.get_explanation(nof_feat_expl = 5, append = False)   # individual explanations

local_expl.global_feat_imp                                      # global explanation
```

A detailed toy example using synthetic data for anomaly detection can be found [here](https://github.com/AVoss84/bhad/blob/main/src/notebooks/Toy_Example.ipynb), and an example using the Titanic dataset that illustrates model explainability can be found [here](https://github.com/AVoss84/bhad/blob/main/src/notebooks/Titanic_Example.ipynb).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/AVoss84/bhad",
    "name": "bhad",
    "maintainer": "Alexander Vosseler",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "bayesian-inference,anomaly-detection,unsupervised-learning,explainability",
    "author": "Alexander Vosseler",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/cd/79/f91da89721d5b2d7e7af36221d659869408cef7539fbee8cf1cc2a1a817a/bhad-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Bayesian Histogram-based Anomaly Detection",
    "version": "0.1.0",
    "project_urls": {
        "Homepage": "https://github.com/AVoss84/bhad"
    },
    "split_keywords": [
        "bayesian-inference",
        "anomaly-detection",
        "unsupervised-learning",
        "explainability"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a2f77015c5b11cc5b7ba1d7b50f91517cd3f591c9f7b94db5a45f093f6cfec60",
                "md5": "51525d4f7ec94168be93323917f10ed8",
                "sha256": "0bddc670d5630507c23bf911f2794f8841bde7bf58e8830778bcf3bc61f3e6a8"
            },
            "downloads": -1,
            "filename": "bhad-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "51525d4f7ec94168be93323917f10ed8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 19327,
            "upload_time": "2023-05-30T09:22:51",
            "upload_time_iso_8601": "2023-05-30T09:22:51.476264Z",
            "url": "https://files.pythonhosted.org/packages/a2/f7/7015c5b11cc5b7ba1d7b50f91517cd3f591c9f7b94db5a45f093f6cfec60/bhad-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "cd79f91da89721d5b2d7e7af36221d659869408cef7539fbee8cf1cc2a1a817a",
                "md5": "e928d73ccac9c35289a77bc865b660b3",
                "sha256": "b6854fffc58f12322c979d1d019b8a52b9613824df1748622456b01481e964f8"
            },
            "downloads": -1,
            "filename": "bhad-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e928d73ccac9c35289a77bc865b660b3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 19136,
            "upload_time": "2023-05-30T09:22:53",
            "upload_time_iso_8601": "2023-05-30T09:22:53.927460Z",
            "url": "https://files.pythonhosted.org/packages/cd/79/f91da89721d5b2d7e7af36221d659869408cef7539fbee8cf1cc2a1a817a/bhad-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-30 09:22:53",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AVoss84",
    "github_project": "bhad",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "bhad"
}
        