xnb 0.3.0

- Summary: Explainable Naive Bayes (XNB) classifier. Using KDE for feature selection and Naive Bayes for prediction.
- Author: Cayetano Romero Vargas (cromerovargas2d@gmail.com)
- License: MIT
- Requires Python: >=3.9.0, <4.0.0
- Keywords: KDE, naive bayes, classification, feature selection, machine learning, explicability, kernel density, class-specific
- Uploaded: 2024-11-20 17:29:39
# Explainable Class–Specific Naive–Bayes Classifier

![Test](https://github.com/sorul/xnb/actions/workflows/testing_coverage.yml/badge.svg?branch=master)
![codecov.io](https://codecov.io/github/sorul/xnb/coverage.svg?branch=master)


## Description
The Explainable Naive Bayes (XNB) classifier includes two important
features:

1) The probability is calculated by means of Kernel Density Estimation (KDE).

2) The probability for each class does not use all variables,
but **only those that are relevant** for each specific class.

In terms of classification performance, the XNB classifier is
comparable to the classical NB classifier. However, the XNB classifier
also provides the subset of relevant variables for each class, which
contributes considerably to explaining how the predictive model reaches
its predictions. In addition, the variable subsets generated for each
class are usually different and have remarkably small cardinality.
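
To illustrate the underlying idea, here is a minimal, hypothetical sketch (not the package's actual implementation): each class is scored with one 1-D Gaussian KDE per relevant feature plus a log-prior, under the naive independence assumption. `kde_class_scores` and the `relevant` argument are illustrative names only; XNB's own bandwidth functions, kernels and algorithms may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_class_scores(x_train, y_train, x_test, relevant):
  '''Sketch: score each test row per class, using only that class's
  relevant features (a dict like xnb.feature_selection_dict).'''
  scores = {}
  for cls in y_train.unique():
    rows = x_train[y_train == cls]
    log_prior = np.log(len(rows) / len(x_train))
    # Naive (conditional independence) assumption:
    # sum the per-feature log-densities.
    log_like = np.zeros(len(x_test))
    for col in relevant[cls]:
      kde = gaussian_kde(rows[col])  # one 1-D KDE per relevant feature
      log_like += np.log(kde(x_test[col]) + 1e-12)  # avoid log(0)
    scores[cls] = log_prior + log_like
  return scores  # predict: the class with the highest score per row
```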

## Installation

If you are using pip, you can install the package with:
```
pip install xnb
```
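
To verify the installation, a quick check using only the standard library (no xnb API assumed):

```python
from importlib.metadata import version

print(version('xnb'))  # e.g. '0.3.0'
```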

## Example of use

```python
from xnb import XNB
from xnb.enums import BWFunctionName, Kernel, Algorithm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris
import pandas as pd

''' 1. Read the dataset.
It is important that the dataset is a pandas DataFrame with named columns,
so that we can later obtain the dictionary of relevant variables per class.'''
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
x = df.drop('target', axis=1)
y = df['target'].replace(
  to_replace=[0, 1, 2], value=['setosa', 'versicolor', 'virginica']
)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=0)

''' 2. Calling the fit() function prepares the object
to make predictions later. '''
xnb = XNB(
  show_progress_bar=True  # optional
)
xnb.fit(
  x_train,
  y_train,
  bw_function=BWFunctionName.HSILVERMAN,  # optional
  kernel=Kernel.GAUSSIAN,  # optional
  algorithm=Algorithm.AUTO,  # optional
  n_sample=50  # optional
)

''' 3. Once fit() has finished, we can access
the feature selection dictionary it has calculated. '''
feature_selection = xnb.feature_selection_dict

''' 4. We predict the values of "y_test", implicitly using the calculated dictionary. '''
y_pred = xnb.predict(x_test)

# Output
print('Relevant features for each class:\n')
for target, features in feature_selection.items():
  print(f'{target}: {features}')
print(f'\n-------------\nAccuracy: {accuracy_score(y_test, y_pred)}')
```
The output is:
```
Relevant features for each class:

setosa: {'petal length (cm)'}
virginica: {'petal length (cm)', 'petal width (cm)'}
versicolor: {'petal length (cm)', 'petal width (cm)'}

-------------
Accuracy: 1.0
```
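
The dictionary above maps each class to the set of features that XNB deems relevant for it. As a small follow-up (assuming only the `feature_selection_dict` attribute already used above), you can check how few columns are used overall and slice the test set for a single class:

```python
# Union of the class-specific subsets: usually far fewer than all features.
all_relevant = sorted(set.union(*feature_selection.values()))
print(f'{len(all_relevant)} of {x.shape[1]} features used: {all_relevant}')

# Columns relevant for one class, e.g. 'virginica'.
virginica_cols = sorted(feature_selection['virginica'])
print(x_test[virginica_cols].head())
```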

## Links
[![GitHub](https://img.shields.io/badge/GitHub-Repository-negro?style=for-the-badge&logo=github)](https://github.com/sorul/xnb)
[![PyPI](https://img.shields.io/badge/PyPI-Package-3776AB?style=for-the-badge&logo=pypi)](https://pypi.org/project/xnb/)

            
