dice-ml

:Name: dice-ml
:Version: 0.11
:Home page: https://github.com/interpretml/DiCE
:Summary: Generate Diverse Counterfactual Explanations for any machine learning model.
:Upload time: 2023-10-27 03:54:08
:Author: Ramaravind Mothilal, Amit Sharma, Chenhao Tan
:Requires Python: >=3.6
:License: MIT
:Keywords: machine-learning explanation interpretability counterfactual

|PyPiVersion|_ |CondaVersion|_ |MITlicense| |PythonSupport|_ |Downloads|_

|BuildStatusTests|_ |BuildStatusNotebooks|_ 

.. |MITlicense| image:: https://img.shields.io/badge/License-MIT-blue.svg
.. _MITlicense: https://img.shields.io/badge/License-MIT-blue.svg

.. |PyPiVersion| image:: https://img.shields.io/pypi/v/dice-ml
.. _PyPiVersion: https://pypi.org/project/dice-ml/

.. |Downloads| image:: https://static.pepy.tech/personalized-badge/dice-ml?period=total&units=international_system&left_color=grey&right_color=orange&left_text=Downloads
.. _Downloads: https://pepy.tech/project/dice-ml

.. |PythonSupport| image:: https://img.shields.io/pypi/pyversions/dice-ml
.. _PythonSupport: https://pypi.org/project/dice-ml/

.. |CondaVersion| image:: https://anaconda.org/conda-forge/dice-ml/badges/version.svg
.. _CondaVersion: https://anaconda.org/conda-forge/dice-ml

.. |BuildStatusTests| image:: https://github.com/interpretml/DiCE/actions/workflows/python-package.yml/badge.svg?branch=main
.. _BuildStatusTests: https://github.com/interpretml/DiCE/actions/workflows/python-package.yml?query=workflow%3A%22Python+package%22

.. |BuildStatusNotebooks| image:: https://github.com/interpretml/DiCE/actions/workflows/notebook-tests.yml/badge.svg?branch=main
.. _BuildStatusNotebooks: https://github.com/interpretml/DiCE/actions/workflows/notebook-tests.yml?query=workflow%3A%22Notebook+tests%22

Diverse Counterfactual Explanations (DiCE) for ML
======================================================================

*How to explain a machine learning model such that the explanation is truthful to the model and yet interpretable to people?*

`Ramaravind K. Mothilal <https://raam93.github.io/>`_, `Amit Sharma <http://www.amitsharma.in/>`_, `Chenhao Tan <https://chenhaot.com/>`_
  
`FAT* '20 paper <https://arxiv.org/abs/1905.07697>`_ | `Docs <https://interpretml.github.io/DiCE/>`_ | `Example Notebooks <https://github.com/interpretml/DiCE/tree/master/docs/source/notebooks>`_ | Live Jupyter notebook |Binder|_

.. |Binder| image:: https://mybinder.org/badge_logo.svg
.. _Binder:  https://mybinder.org/v2/gh/interpretML/DiCE/master?filepath=docs/source/notebooks

 **Blog Post**: `Explanation for ML using diverse counterfactuals <https://www.microsoft.com/en-us/research/blog/open-source-library-provides-explanation-for-machine-learning-through-diverse-counterfactuals/>`_
 
 **Case Studies**: `Towards Data Science <https://towardsdatascience.com/dice-diverse-counterfactual-explanations-for-hotel-cancellations-762c311b2c64>`_ (Hotel Bookings) | `Analytics Vidhya <https://medium.com/analytics-vidhya/dice-ml-models-with-counterfactual-explanations-for-the-sunk-titanic-30aa035056e0>`_ (Titanic Dataset)
 
.. image:: https://www.microsoft.com/en-us/research/uploads/prod/2020/01/MSR-Amit_1400x788-v3-1blog.gif
  :align: center
  :alt: Visualizing a counterfactual explanation
  
Explanations are critical for machine learning, especially as machine learning-based systems are being used to inform decisions in societally critical domains such as finance, healthcare, education, and criminal justice.
However, most explanation methods depend on an approximation of the ML model to
create an interpretable explanation. For example,
consider a person who applied for a loan and was rejected by the loan distribution algorithm of a financial company. Typically, the company may explain why the loan was rejected, for example, due to "poor credit history". However, such an explanation does not help the person decide *what they should do next* to improve their chances of being approved in the future. Critically, changing the most important feature may not be enough to flip the decision of the algorithm, and in practice that feature, such as gender or race, may not even be changeable.


DiCE implements `counterfactual (CF) explanations <https://arxiv.org/abs/1711.00399>`_  that provide this information by showing feature-perturbed versions of the same person who would have received the loan, e.g., ``you would have received the loan if your income was higher by $10,000``. In other words, it provides "what-if" explanations for model output and can be a useful complement to other explanation methods, both for end-users and model developers.

Barring simple linear models, however, it is difficult to generate CF examples that work for any machine learning model. DiCE is based on `recent research <https://arxiv.org/abs/1905.07697>`_ that generates CF explanations for any ML model. The core idea is to set up the search for such explanations as an optimization problem, similar to finding adversarial examples. The critical difference is that for explanations, we need perturbations that change the output of a machine learning model but are also diverse and feasible to change. Therefore, DiCE supports generating a set of counterfactual explanations and has tunable parameters for the diversity of the explanations and their proximity to the original input. It also supports simple constraints on features to ensure feasibility of the generated counterfactual examples.
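
The trade-off between proximity and diversity described above can be made concrete with a small scoring function. This is a minimal sketch under stated assumptions, not DiCE's actual objective: the function names are hypothetical, distances are plain L1 distances over numeric features, diversity is approximated by the mean pairwise distance between candidates, and the validity term (whether each candidate actually flips the model output) is omitted.

.. code:: python

    from itertools import combinations


    def l1_distance(a, b):
        """Sum of absolute feature differences between two numeric points."""
        return sum(abs(x - y) for x, y in zip(a, b))


    def proximity(query, candidates):
        """Negative mean distance to the query (higher means closer)."""
        return -sum(l1_distance(query, c) for c in candidates) / len(candidates)


    def diversity(candidates):
        """Mean pairwise distance between candidates (assumed stand-in measure)."""
        pairs = list(combinations(candidates, 2))
        if not pairs:
            return 0.0
        return sum(l1_distance(a, b) for a, b in pairs) / len(pairs)


    def objective(query, candidates, proximity_weight=0.5, diversity_weight=1.0):
        """Score a candidate set; higher is better under this toy objective."""
        return (proximity_weight * proximity(query, candidates)
                + diversity_weight * diversity(candidates))


    query = [0.0, 0.0]
    candidates = [[1.0, 0.0], [0.0, 1.0]]
    score = objective(query, candidates)

Raising ``proximity_weight`` favors candidate sets that stay close to the query, while raising ``diversity_weight`` favors sets whose members differ from one another.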


Installing DiCE
---------------
DiCE supports Python 3.6+. The stable version of DiCE is available on `PyPI <https://pypi.org/project/dice-ml/>`_.

.. code:: bash

    pip install dice-ml

DiCE is also available on `conda-forge <https://anaconda.org/conda-forge/dice-ml>`_. 

.. code:: bash

    conda install -c conda-forge dice-ml

To install the latest (dev) version of DiCE and its dependencies, clone this repo and run ``pip install`` from the top-most folder of the repo:

.. code:: bash

    pip install -e .

If you face any problems, try installing dependencies manually.

.. code:: bash

    pip install -r requirements.txt
    # Additional dependencies for deep learning models
    pip install -r requirements-deeplearning.txt
    # For running unit tests
    pip install -r requirements-test.txt


Getting started with DiCE
-------------------------
With DiCE, generating explanations is a simple three-step process: set up a dataset, train a model, and then invoke DiCE to generate counterfactual examples for any input. DiCE can also work with pre-trained models, with or without their original training data.


.. code:: python

    import dice_ml
    from dice_ml.utils import helpers # helper functions
    from sklearn.model_selection import train_test_split

    dataset = helpers.load_adult_income_dataset()
    target = dataset["income"] # outcome variable 
    train_dataset, test_dataset, _, _ = train_test_split(dataset,
                                                         target,
                                                         test_size=0.2,
                                                         random_state=0,
                                                         stratify=target)
    # Dataset for training an ML model
    d = dice_ml.Data(dataframe=train_dataset,
                     continuous_features=['age', 'hours_per_week'],
                     outcome_name='income')
    
    # Pre-trained ML model
    m = dice_ml.Model(model_path=dice_ml.utils.helpers.get_adult_income_modelpath(),
                      backend='TF2', func="ohe-min-max")
    # DiCE explanation instance
    exp = dice_ml.Dice(d,m)

For any given input, we can now generate counterfactual explanations. For
example, the following input leads to class 0 (low income) and we would like to know what minimal changes would lead to a prediction of 1 (high income).

.. code:: python
    
    # Generate counterfactual examples
    query_instance = test_dataset.drop(columns="income")[0:1]
    dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class="opposite")
    # Visualize counterfactual explanation
    dice_exp.visualize_as_dataframe()

.. image:: https://raw.githubusercontent.com/interpretml/DiCE/master/docs/_static/getting_started_updated.png 
  :width: 400
  :alt: List of counterfactual examples

You can save the generated counterfactual examples in the following way.

.. code:: python

    # Save generated counterfactual examples to disk
    dice_exp.cf_examples_list[0].final_cfs_df.to_csv(path_or_buf='counterfactuals.csv', index=False)


For more details, check out the `docs/source/notebooks <https://github.com/interpretml/DiCE/tree/master/docs/source/notebooks>`_ folder. Here are some example notebooks:

* `Getting Started <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_getting_started.ipynb>`_: Generate CF examples for a ``sklearn``, ``tensorflow`` or ``pytorch`` binary classifier and compute feature importance scores.
* `Explaining Multi-class Classifiers and Regressors
  <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_multiclass_classification_and_regression.ipynb>`_: Generate CF explanations for a multi-class classifier or regressor.
* `Local and Global Feature Importance <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_feature_importances.ipynb>`_: Estimate local and global feature importance scores using generated counterfactuals.
* `Providing Constraints on Counterfactual Generation
  <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_: Specify which features to vary and their permissible ranges for valid counterfactual examples.

Supported methods for generating counterfactuals
------------------------------------------------
DiCE can generate counterfactual examples using the following methods.

**Model-agnostic methods**

* Randomized sampling 
* KD-Tree (for counterfactuals within the training data)
* Genetic algorithm 

See `model-agnostic notebook
<https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_ for code examples on using these methods.
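
To illustrate the idea behind randomized sampling, the sketch below searches for counterfactuals of a toy black-box model using only its predictions. It is an illustration under stated assumptions rather than DiCE's implementation: ``toy_model``, ``random_counterfactuals``, and the feature ranges are hypothetical, and the search simply resamples a random subset of features until enough candidates reach the desired class.

.. code:: python

    import random


    def toy_model(features):
        """Hypothetical black-box classifier over (age, hours_per_week)."""
        age, hours_per_week = features
        score = 0.02 * age + 0.03 * hours_per_week
        return 1 if score >= 2.0 else 0


    def random_counterfactuals(model, query, feature_ranges, total_cfs,
                               desired_class, seed=0, max_tries=10000):
        """Resample random feature subsets until enough counterfactuals are found."""
        rng = random.Random(seed)
        found = []
        for _ in range(max_tries):
            candidate = list(query)
            # Change between one and all features, each drawn from its range
            n_changes = rng.randint(1, len(query))
            for index in rng.sample(range(len(query)), n_changes):
                low, high = feature_ranges[index]
                candidate[index] = rng.randint(low, high)
            # Keep the candidate only if it is new and reaches the desired class
            if model(candidate) == desired_class and candidate not in found:
                found.append(candidate)
            if len(found) == total_cfs:
                break
        return found


    query = [30, 20]  # predicted as class 0 by toy_model
    cfs = random_counterfactuals(toy_model, query, [(17, 90), (1, 99)],
                                 total_cfs=3, desired_class=1)

Because the search only calls ``model(candidate)``, the same loop works for any classifier, which is what makes this family of methods model-agnostic.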

**Gradient-based methods**

* An explicit loss-based method described in `Mothilal et al. (2020) <https://arxiv.org/abs/1905.07697>`_ (Default for deep learning models).
* A Variational AutoEncoder (VAE)-based method described in `Mahajan et al. (2019) <https://arxiv.org/abs/1912.03277>`_ (see the BaseVAE `notebook <https://github.com/interpretml/DiCE/blob/master/docs/notebooks/DiCE_getting_started_feasible.ipynb>`_).

Both gradient-based methods require a differentiable model, such as a neural network. If you are interested in a specific method, please raise an issue `here <https://github.com/interpretml/DiCE/issues>`_.
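
Why a differentiable model is needed can be seen in a small gradient-descent sketch. The example below is an illustration under stated assumptions, not the loss used by DiCE (see the cited paper for that): the model is a hand-written logistic regression, the loss is a squared error toward the target probability plus a squared-distance proximity penalty, and all function names and parameter values are hypothetical.

.. code:: python

    import math


    def predict(x, weights, bias):
        """Probability of class 1 under a logistic regression model."""
        z = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1.0 / (1.0 + math.exp(-z))


    def gradient_counterfactual(x0, weights, bias, target=1.0,
                                proximity_weight=0.1, lr=0.5, steps=500):
        """Minimize (p - target)^2 + proximity_weight * ||x - x0||^2 over x."""
        x = list(x0)
        for _ in range(steps):
            p = predict(x, weights, bias)
            # Derivative of (p - target)^2 with respect to the logit z
            dloss_dz = 2.0 * (p - target) * p * (1.0 - p)
            for i in range(len(x)):
                grad_prediction = dloss_dz * weights[i]
                grad_proximity = 2.0 * proximity_weight * (x[i] - x0[i])
                x[i] -= lr * (grad_prediction + grad_proximity)
        return x


    weights, bias = [1.0, 2.0], -3.0
    x0 = [0.5, 0.5]  # predicted probability is below 0.5
    cf = gradient_counterfactual(x0, weights, bias)

The update relies on the gradient of the model output with respect to its input, which is only available when the model is differentiable.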

Supported use-cases
-------------------
**Data**

DiCE does not need access to the full dataset. It only requires metadata properties for each feature (min, max for continuous features and levels for categorical features). Thus, for sensitive data, the dataset can be provided as:

.. code:: python

    import dice_ml

    # Each feature is described by its [min, max] range (continuous) or its levels (categorical)
    d = dice_ml.Data(features={
                         'age': [17, 90],
                         'workclass': ['Government', 'Other/Unknown', 'Private', 'Self-Employed'],
                         'education': ['Assoc', 'Bachelors', 'Doctorate', 'HS-grad', 'Masters', 'Prof-school', 'School', 'Some-college'],
                         'marital_status': ['Divorced', 'Married', 'Separated', 'Single', 'Widowed'],
                         'occupation': ['Blue-Collar', 'Other/Unknown', 'Professional', 'Sales', 'Service', 'White-Collar'],
                         'race': ['Other', 'White'],
                         'gender': ['Female', 'Male'],
                         'hours_per_week': [1, 99]},
                     outcome_name='income')
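
A metadata dictionary of this shape can be derived once from the sensitive data and then shared without the records themselves. The helper below is a minimal sketch of that step; ``summarize_features`` is a hypothetical function, not part of DiCE, and it assumes the data is available as a list of dictionaries.

.. code:: python

    def summarize_features(records, continuous_features, outcome_name):
        """Build {feature: [min, max]} or {feature: sorted levels} from records."""
        features = {}
        for name in records[0]:
            if name == outcome_name:
                continue  # the outcome column is not a feature
            values = [record[name] for record in records]
            if name in continuous_features:
                features[name] = [min(values), max(values)]
            else:
                features[name] = sorted(set(values))
        return features


    records = [
        {'age': 25, 'gender': 'Female', 'hours_per_week': 40, 'income': 0},
        {'age': 52, 'gender': 'Male', 'hours_per_week': 60, 'income': 1},
        {'age': 38, 'gender': 'Female', 'hours_per_week': 20, 'income': 0},
    ]
    features = summarize_features(records, ['age', 'hours_per_week'], 'income')

Only the ranges and levels leave the secure environment; individual rows are never exposed.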

**Model**

We support pre-trained models as well as training a model. Here's a simple example using TensorFlow.

.. code:: python

    import dice_ml
    import tensorflow as tf
    from tensorflow import keras

    # TensorFlow 1.x-style session (under TensorFlow 2.x, use tf.compat.v1.InteractiveSession())
    sess = tf.InteractiveSession()
    # Generating train and test data (d is a dice_ml.Data object built from a dataframe)
    train, _ = d.split_data(d.normalize_data(d.one_hot_encoded_data))
    X_train = train.loc[:, train.columns != 'income']
    y_train = train.loc[:, train.columns == 'income']
    # Fitting a dense neural network model
    ann_model = keras.Sequential()
    ann_model.add(keras.layers.Dense(20, input_shape=(X_train.shape[1],), kernel_regularizer=keras.regularizers.l1(0.001), activation=tf.nn.relu))
    ann_model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))
    ann_model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(0.01), metrics=['accuracy'])
    ann_model.fit(X_train, y_train, validation_split=0.20, epochs=100, verbose=0, class_weight={0:1,1:2})

    # Generate the DiCE model for explanation
    m = dice_ml.Model(model=ann_model)

Check out the `Getting Started <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_getting_started.ipynb>`_ notebook to see code examples on using DiCE with sklearn and PyTorch models.

**Explanations**

We visualize explanations through a table highlighting the change in features. We plan to support an English language explanation too!

Feasibility of counterfactual explanations
-------------------------------------------
We acknowledge that not all counterfactual explanations may be feasible for a
user. In general, counterfactuals closer to an individual's profile will be
more feasible. Diversity is also important to help an individual choose between
multiple possible options.

DiCE provides tunable parameters for diversity and proximity to generate
different kinds of explanations.

.. code:: python

    dice_exp = exp.generate_counterfactuals(query_instance,
                    total_CFs=4, desired_class="opposite",
                    proximity_weight=1.5, diversity_weight=1.0)

Additionally, it may be the case that some features are harder to change than
others (e.g., education level is harder to change than working hours per week). DiCE lets you express the relative difficulty of changing a feature through *feature weights*. A higher feature weight means that the feature is harder to change than others. For instance, one measure of the relative difficulty of changing a continuous feature is the median absolute deviation (MAD) of its values. By default, DiCE computes this internally and divides the distance between continuous features by the MAD of the feature's values in the training set. We can also assign different values through the ``feature_weights`` parameter.

.. code:: python

    # assigning new weights
    feature_weights = {'age': 10, 'hours_per_week': 5}
    # Now generating explanations using the new feature weights
    dice_exp = exp.generate_counterfactuals(query_instance,
                    total_CFs=4, desired_class="opposite",
                    feature_weights=feature_weights)
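
The MAD-based default can be sketched in a few lines. This is an illustration of the idea rather than DiCE's internal code: the function names are hypothetical, MAD is taken to be the median absolute deviation from the median, and the weight of each feature is assumed to be the inverse of its MAD, so that dividing a distance by the MAD is the same as multiplying it by the weight.

.. code:: python

    from statistics import median


    def median_absolute_deviation(values):
        """Median of the absolute deviations from the median."""
        center = median(values)
        return median([abs(v - center) for v in values])


    def mad_feature_weights(training_columns):
        """Assumed convention: weight = 1 / MAD, falling back to 1.0 if MAD is 0."""
        weights = {}
        for name, values in training_columns.items():
            mad = median_absolute_deviation(values)
            weights[name] = 1.0 / mad if mad > 0 else 1.0
        return weights


    def weighted_distance(query, candidate, weights):
        """Sum of per-feature absolute changes, scaled by the feature weights."""
        return sum(weights[name] * abs(candidate[name] - query[name])
                   for name in weights)


    training_columns = {
        'age': [20, 30, 40, 50, 60],
        'hours_per_week': [38, 40, 40, 42, 60],
    }
    weights = mad_feature_weights(training_columns)
    distance = weighted_distance({'age': 30, 'hours_per_week': 40},
                                 {'age': 40, 'hours_per_week': 44}, weights)

A feature whose values are tightly clustered in the training data has a small MAD and therefore a large weight, so even a small change to it counts as a large move away from the original input.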

Finally, some features, such as one's age or race, are impossible to change. DiCE therefore also accepts a
list of the features that are allowed to vary.

.. code:: python

    dice_exp = exp.generate_counterfactuals(query_instance,
                    total_CFs=4, desired_class="opposite",
                    features_to_vary=['age','workclass','education','occupation','hours_per_week'])

DiCE also supports simple constraints on
features that reflect practical limits. For example, the ``permitted_range``
parameter can require that working hours per week stay between 10 and 50.
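
The effect of these two kinds of constraints can be sketched as a feasibility check on a candidate counterfactual. The function below is a hypothetical illustration, not DiCE's implementation: it assumes the query and the candidate are dictionaries with the same keys, that ``features_to_vary`` is a list of feature names, and that ``permitted_range`` maps a feature name to a ``[low, high]`` pair.

.. code:: python

    def is_feasible(query, candidate, features_to_vary, permitted_range):
        """Return True if the candidate only makes allowed, in-range changes."""
        for name, original in query.items():
            value = candidate[name]
            if value == original:
                continue  # unchanged features are always acceptable
            if name not in features_to_vary:
                return False  # a frozen feature was modified
            if name in permitted_range:
                low, high = permitted_range[name]
                if not low <= value <= high:
                    return False  # the new value falls outside the permitted range
        return True


    query = {'age': 30, 'race': 'White', 'hours_per_week': 20}
    features_to_vary = ['hours_per_week']
    permitted_range = {'hours_per_week': [10, 50]}

    ok = is_feasible(query, {'age': 30, 'race': 'White', 'hours_per_week': 45},
                     features_to_vary, permitted_range)
    too_many_hours = is_feasible(query, {'age': 30, 'race': 'White', 'hours_per_week': 70},
                                 features_to_vary, permitted_range)
    changed_age = is_feasible(query, {'age': 25, 'race': 'White', 'hours_per_week': 45},
                              features_to_vary, permitted_range)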

For more details, check out `this <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_ notebook.

The promise of counterfactual explanations
-------------------------------------------
Because they are truthful to the model, counterfactual explanations can be useful to all stakeholders in a decision made by a machine learning model.

* **Decision subjects**: Counterfactual explanations can be used to explore actionable recourse for a person based on a decision received from an ML model. DiCE shows decision outcomes with *actionable* alternative profiles, to help people understand what they could have done to change their model outcome.

* **ML model developers**: Counterfactual explanations are also useful for model developers to debug their model for potential problems. DiCE can be used to show CF explanations for a selection of inputs, which can uncover problematic (in)dependencies on some features (e.g., for 95% of inputs, changing features X and Y changes the outcome, but not for the other 5%). We aim to support aggregate metrics to help developers debug ML models.

* **Decision makers**: Counterfactual explanations may be useful to
  decision-makers such as doctors or judges who may use ML models to make decisions. For a particular individual, DiCE allows probing the ML model to see the possible changes that lead to a different ML outcome, thus enabling decision-makers to assess their trust in the prediction.

* **Decision evaluators**: Finally, counterfactual explanations can be useful
  to decision evaluators who may be interested in fairness or other desirable
  properties of an ML model. We plan to add support for this in the future.
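
For the model-debugging use case above, one simple aggregate is how often each feature changes across the counterfactuals generated for an input. The sketch below illustrates that count; ``local_feature_importance`` is a hypothetical helper, not DiCE's API, and it assumes the query and the counterfactuals are dictionaries with the same keys.

.. code:: python

    def local_feature_importance(query, counterfactuals):
        """Fraction of counterfactuals in which each feature differs from the query."""
        counts = {name: 0 for name in query}
        for cf in counterfactuals:
            for name, original in query.items():
                if cf[name] != original:
                    counts[name] += 1
        return {name: count / len(counterfactuals)
                for name, count in counts.items()}


    query = {'age': 30, 'hours_per_week': 20, 'education': 'HS-grad'}
    counterfactuals = [
        {'age': 30, 'hours_per_week': 45, 'education': 'HS-grad'},
        {'age': 30, 'hours_per_week': 50, 'education': 'Masters'},
        {'age': 30, 'hours_per_week': 20, 'education': 'Doctorate'},
        {'age': 30, 'hours_per_week': 60, 'education': 'HS-grad'},
    ]
    importance = local_feature_importance(query, counterfactuals)

A feature that changes in most counterfactuals is one the model relies on heavily for this input, while a feature that never changes may be irrelevant to the decision or simply hard to vary.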


Roadmap
-------
Ideally, counterfactual explanations should balance between a wide range of suggested changes (*diversity*), and the relative ease of adopting those changes (*proximity* to the original input), and also follow the causal laws of the world, e.g., one can hardly lower their educational degree or change their race.

We are working on adding the following features to DiCE:

* Support for using DiCE for debugging machine learning models
* Constructed English phrases (e.g., ``desired outcome if feature was changed``) and other ways to output the counterfactual examples
* Evaluating feature attribution methods like LIME and SHAP on necessity and sufficiency metrics using counterfactuals (see `this paper <https://arxiv.org/abs/2011.04917>`_)
* Support for Bayesian optimization and other algorithms for generating counterfactual explanations
* Better feasibility constraints for counterfactual generation 

Citing
-------
If you find DiCE useful for your research work, please cite it as follows.

Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan (2020). **Explaining machine learning classifiers through diverse counterfactual explanations**. *Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency*. 

Bibtex::

	@inproceedings{mothilal2020dice,
  		title={Explaining machine learning classifiers through diverse counterfactual explanations},
  		author={Mothilal, Ramaravind K and Sharma, Amit and Tan, Chenhao},
  		booktitle={Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency},
  		pages={607--617},
  		year={2020}
	}


Contributing
------------

This project welcomes contributions and suggestions.  Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the `Microsoft Open Source Code of Conduct <https://opensource.microsoft.com/codeofconduct/>`_.
For more information see the `Code of Conduct FAQ <https://opensource.microsoft.com/codeofconduct/faq/>`_ or
contact `opencode@microsoft.com <mailto:opencode@microsoft.com>`_ with any additional questions or comments.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/interpretml/DiCE",
    "name": "dice-ml",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "machine-learning explanation interpretability counterfactual",
    "author": "Ramaravind Mothilal, Amit Sharma, Chenhao Tan",
    "author_email": "raam.arvind93@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/f4/fc/2129adcdcd6ffb771ef369b9413ca8461d76b24c203b09d306bef56fb6cf/dice_ml-0.11.tar.gz",
    "platform": null,
    "description": "|PyPiVersion|_ |CondaVersion|_ |MITlicense| |PythonSupport|_ |Downloads|_ \n\n|BuildStatusTests|_ |BuildStatusNotebooks|_ \n\n.. |MITlicense| image:: https://img.shields.io/badge/License-MIT-blue.svg\n.. _MITlicense: https://img.shields.io/badge/License-MIT-blue.svg\n\n.. |PyPiVersion| image:: https://img.shields.io/pypi/v/dice-ml\n.. _PyPiVersion: https://pypi.org/project/dice-ml/\n\n.. |Downloads| image:: https://static.pepy.tech/personalized-badge/dice-ml?period=total&units=international_system&left_color=grey&right_color=orange&left_text=Downloads\n.. _Downloads: https://pepy.tech/project/dice-ml\n\n.. |PythonSupport| image:: https://img.shields.io/pypi/pyversions/dice-ml\n.. _PythonSupport: https://pypi.org/project/dice-ml/\n\n.. |CondaVersion| image:: https://anaconda.org/conda-forge/dice-ml/badges/version.svg\n.. _CondaVersion: https://anaconda.org/conda-forge/dice-ml\n\n.. |BuildStatusTests| image:: https://github.com/interpretml/DiCE/actions/workflows/python-package.yml/badge.svg?branch=main\n.. _BuildStatusTests: https://github.com/interpretml/DiCE/actions/workflows/python-package.yml?query=workflow%3A%22Python+package%22\n\n.. |BuildStatusNotebooks| image:: https://github.com/interpretml/DiCE/actions/workflows/notebook-tests.yml/badge.svg?branch=main\n.. _BuildStatusNotebooks: https://github.com/interpretml/DiCE/actions/workflows/notebook-tests.yml?query=workflow%3A%22Notebook+tests%22\n\nDiverse Counterfactual Explanations (DiCE) for ML\n======================================================================\n\n*How to explain a machine learning model such that the explanation is truthful to the model and yet interpretable to people?*\n\n`Ramaravind K. 
Mothilal <https://raam93.github.io/>`_, `Amit Sharma <http://www.amitsharma.in/>`_, `Chenhao Tan <https://chenhaot.com/>`_\n  \n`FAT* '20 paper <https://arxiv.org/abs/1905.07697>`_ | `Docs <https://interpretml.github.io/DiCE/>`_ | `Example Notebooks <https://github.com/interpretml/DiCE/tree/master/docs/source/notebooks>`_ | Live Jupyter notebook |Binder|_\n\n.. |Binder| image:: https://mybinder.org/badge_logo.svg\n.. _Binder:  https://mybinder.org/v2/gh/interpretML/DiCE/master?filepath=docs/source/notebooks\n\n **Blog Post**: `Explanation for ML using diverse counterfactuals <https://www.microsoft.com/en-us/research/blog/open-source-library-provides-explanation-for-machine-learning-through-diverse-counterfactuals/>`_\n \n **Case Studies**: `Towards Data Science <https://towardsdatascience.com/dice-diverse-counterfactual-explanations-for-hotel-cancellations-762c311b2c64>`_ (Hotel Bookings) | `Analytics Vidhya <https://medium.com/analytics-vidhya/dice-ml-models-with-counterfactual-explanations-for-the-sunk-titanic-30aa035056e0>`_ (Titanic Dataset)\n \n.. image:: https://www.microsoft.com/en-us/research/uploads/prod/2020/01/MSR-Amit_1400x788-v3-1blog.gif\n  :align: center\n  :alt: Visualizing a counterfactual explanation\n  \nExplanations are critical for machine learning, especially as machine learning-based systems are being used to inform decisions in societally critical domains such as finance, healthcare, education, and criminal justice.\nHowever, most explanation methods depend on an approximation of the ML model to\ncreate an interpretable explanation. For example,\nconsider a person who applied for a loan and was rejected by the loan distribution algorithm of a financial company. Typically, the company may provide an explanation on why the loan was rejected, for example, due to \"poor credit history\". However, such an explanation does not help the person decide *what they should do next* to improve their chances of being approved in the future. 
Critically, the most important feature may not be enough to flip the decision of the algorithm, and in practice, may not even be changeable such as gender and race.\n\n\nDiCE implements `counterfactual (CF) explanations <https://arxiv.org/abs/1711.00399>`_  that provide this information by showing feature-perturbed versions of the same person who would have received the loan, e.g., ``you would have received the loan if your income was higher by $10,000``. In other words, it provides \"what-if\" explanations for model output and can be a useful complement to other explanation methods, both for end-users and model developers.\n\nBarring simple linear models, however, it is difficult to generate CF examples that work for any machine learning model. DiCE is based on `recent research <https://arxiv.org/abs/1905.07697>`_ that generates CF explanations for any ML model. The core idea is to setup finding such explanations as an optimization problem, similar to finding adversarial examples. The critical difference is that for explanations, we need perturbations that change the output of a machine learning model, but are also diverse and feasible to change. Therefore, DiCE supports generating a set of counterfactual explanations  and has tunable parameters for diversity and proximity of the explanations to the original input. It also supports simple constraints on features to ensure feasibility of the generated counterfactual examples.\n\n\nInstalling DICE\n-----------------\nDiCE supports Python 3+. The stable version of DiCE is available on `PyPI <https://pypi.org/project/dice-ml/>`_.\n\n.. code:: bash\n\n    pip install dice-ml\n\nDiCE is also available on `conda-forge <https://anaconda.org/conda-forge/dice-ml>`_. \n\n.. code:: bash\n\n    conda install -c conda-forge dice-ml\n\nTo install the latest (dev) version of DiCE and its dependencies, clone this repo and run `pip install` from the top-most folder of the repo:\n\n.. 
code:: bash\n\n    pip install -e .\n\nIf you face any problems, try installing dependencies manually.\n\n.. code:: bash\n\n    pip install -r requirements.txt\n    # Additional dependendies for deep learning models\n    pip install -r requirements-deeplearning.txt\n    # For running unit tests\n    pip install -r requirements-test.txt\n\n\nGetting started with DiCE\n-------------------------\nWith DiCE, generating explanations is a simple three-step  process: set up a dataset, train a model, and then invoke DiCE to generate counterfactual examples for any input. DiCE can also work with pre-trained models, with or without their original training data. \n\n\n.. code:: python\n\n    import dice_ml\n    from dice_ml.utils import helpers # helper functions\n    from sklearn.model_selection import train_test_split\n\n    dataset = helpers.load_adult_income_dataset()\n    target = dataset[\"income\"] # outcome variable \n    train_dataset, test_dataset, _, _ = train_test_split(dataset,\n                                                         target,\n                                                         test_size=0.2,\n                                                         random_state=0,\n                                                         stratify=target)\n    # Dataset for training an ML model\n    d = dice_ml.Data(dataframe=train_dataset,\n                     continuous_features=['age', 'hours_per_week'],\n                     outcome_name='income')\n    \n    # Pre-trained ML model\n    m = dice_ml.Model(model_path=dice_ml.utils.helpers.get_adult_income_modelpath(),\n                      backend='TF2', func=\"ohe-min-max\")\n    # DiCE explanation instance\n    exp = dice_ml.Dice(d,m)\n\nFor any given input, we can now generate counterfactual explanations. For\nexample, the following input leads to class 0 (low income) and we would like to know what minimal changes would lead to a prediction of 1 (high income).\n\n.. 
code:: python\n    \n    # Generate counterfactual examples\n    query_instance = test_dataset.drop(columns=\"income\")[0:1]\n    dice_exp = exp.generate_counterfactuals(query_instance, total_CFs=4, desired_class=\"opposite\")\n    # Visualize counterfactual explanation\n    dice_exp.visualize_as_dataframe()\n\n.. image:: https://raw.githubusercontent.com/interpretml/DiCE/master/docs/_static/getting_started_updated.png \n  :width: 400\n  :alt: List of counterfactual examples\n\nYou can save the generated counterfactual examples in the following way.\n\n.. code:: python\n\n    # Save generated counterfactual examples to disk\n    dice_exp.cf_examples_list[0].final_cfs_df.to_csv(path_or_buf='counterfactuals.csv', index=False)\n\n\nFor more details, check out the `docs/source/notebooks <https://github.com/interpretml/DiCE/tree/master/docs/source/notebooks>`_ folder. Here are some example notebooks:\n\n* `Getting Started <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_getting_started.ipynb>`_: Generate CF examples for a `sklearn`, `tensorflow` or `pytorch` binary classifier and compute feature importance scores.\n* `Explaining Multi-class Classifiers and Regressors\n  <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_multiclass_classification_and_regression.ipynb>`_: Generate CF explanations for a multi-class classifier or regressor.\n* `Local and Global Feature Importance <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_feature_importances.ipynb>`_: Estimate local and global feature importance scores using generated counterfactuals.\n* `Providing Constraints on Counterfactual Generation\n  <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_: Specifying which features to vary and their permissible ranges for valid counterfactual examples.\n\nSupported methods for generating 
counterfactuals\n------------------------------------------------\nDiCE can generate counterfactual examples using the following methods.\n\n**Model-agnostic methods**\n\n* Randomized sampling \n* KD-Tree (for counterfactuals within the training data)\n* Genetic algorithm \n\nSee `model-agnostic notebook\n<https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_ for code examples on using these methods.\n\n**Gradient-based methods**\n\n* An explicit loss-based method described in `Mothilal et al. (2020) <https://arxiv.org/abs/1905.07697>`_ (Default for deep learning models).\n* A Variational AutoEncoder (VAE)-based method described in `Mahajan et al. (2019) <https://arxiv.org/abs/1912.03277>`_ (see the BaseVAE `notebook <https://github.com/interpretml/DiCE/blob/master/docs/notebooks/DiCE_getting_started_feasible.ipynb>`_).\n\nThe last two methods require a differentiable model, such as a neural network. If you are interested in a specific method, do raise an issue `here <https://github.com/interpretml/DiCE/issues>`_.\n\nSupported use-cases\n-------------------\n**Data**\n\nDiCE does not need access to the full dataset. It only requires metadata properties for each feature (min, max for continuous features and levels for categorical features). Thus, for sensitive data, the dataset can be provided as:\n\n.. 
code:: python\n\n    d = data.Data(features={\n                       'age':[17, 90],\n                       'workclass': ['Government', 'Other/Unknown', 'Private', 'Self-Employed'],\n                       'education': ['Assoc', 'Bachelors', 'Doctorate', 'HS-grad', 'Masters', 'Prof-school', 'School', 'Some-college'],\n                       'marital_status': ['Divorced', 'Married', 'Separated', 'Single', 'Widowed'],\n                       'occupation':['Blue-Collar', 'Other/Unknown', 'Professional', 'Sales', 'Service', 'White-Collar'],\n                       'race': ['Other', 'White'],\n                       'gender':['Female', 'Male'],\n                       'hours_per_week': [1, 99]},\n             outcome_name='income')\n\n**Model**\n\nWe support pre-trained models as well as training a model. Here's a simple example using Tensorflow. \n\n.. code:: python\n\n    sess = tf.InteractiveSession()\n    # Generating train and test data\n    train, _ = d.split_data(d.normalize_data(d.one_hot_encoded_data))\n    X_train = train.loc[:, train.columns != 'income']\n    y_train = train.loc[:, train.columns == 'income']\n    # Fitting a dense neural network model\n    ann_model = keras.Sequential()\n    ann_model.add(keras.layers.Dense(20, input_shape=(X_train.shape[1],), kernel_regularizer=keras.regularizers.l1(0.001), activation=tf.nn.relu))\n    ann_model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))\n    ann_model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.Adam(0.01), metrics=['accuracy'])\n    ann_model.fit(X_train, y_train, validation_split=0.20, epochs=100, verbose=0, class_weight={0:1,1:2})\n\n    # Generate the DiCE model for explanation\n    m = model.Model(model=ann_model)\n\nCheck out the `Getting Started <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_getting_started.ipynb>`_ notebook to see code examples on using DiCE with sklearn and PyTorch models.\n\n**Explanations**\n\nWe visualize explanations 
through a table highlighting the change in features. We plan to support an English language explanation too!\n\nFeasibility of counterfactual explanations\n-------------------------------------------\nWe acknowledge that not all counterfactual explanations may be feasible for a\nuser. In general, counterfactuals closer to an individual's profile will be\nmore feasible. Diversity is also important to help an individual choose between\nmultiple possible options.\n\nDiCE provides tunable parameters for diversity and proximity to generate\ndifferent kinds of explanations.\n\n.. code:: python\n\n    dice_exp = exp.generate_counterfactuals(query_instance,\n                    total_CFs=4, desired_class=\"opposite\",\n                    proximity_weight=1.5, diversity_weight=1.0)\n\nAdditionally, it may be the case that some features are harder to change than\nothers (e.g., education level is harder to change than working hours per week). DiCE allows input of relative difficulty in changing a feature through specifying *feature weights*. A higher feature weight means that the feature is harder to change than others. For instance, one way is to use the mean absolute deviation from the median as a measure of relative difficulty of changing a continuous feature. By default, DiCE computes this internally and divides the distance between continuous features by the MAD of the feature's values in the training set. We can also assign different values through the *feature_weights* parameter. \n\n.. code:: python\n\n    # assigning new weights\n    feature_weights = {'age': 10, 'hours_per_week': 5}\n    # Now generating explanations using the new feature weights\n    dice_exp = exp.generate_counterfactuals(query_instance,\n                    total_CFs=4, desired_class=\"opposite\",\n                    feature_weights=feature_weights)\n\nFinally, some features are impossible to change such as one's age or race. 
Therefore, DiCE also accepts a\nlist of features to vary.\n\n.. code:: python\n\n    dice_exp = exp.generate_counterfactuals(query_instance,\n                    total_CFs=4, desired_class=\"opposite\",\n                    features_to_vary=['age','workclass','education','occupation','hours_per_week'])\n\nIt also supports simple range constraints on\nfeatures through the ``permitted_range`` parameter (e.g., working hours per week\nshould be between 10 and 50).\n\nFor more details, check out `this <https://github.com/interpretml/DiCE/blob/master/docs/source/notebooks/DiCE_model_agnostic_CFs.ipynb>`_ notebook.\n\nThe promise of counterfactual explanations\n-------------------------------------------\nBecause they are truthful to the model, counterfactual explanations can be useful to all stakeholders in a decision made by a machine learning model.\n\n* **Decision subjects**: Counterfactual explanations can be used to explore actionable recourse for a person based on a decision received from an ML model. DiCE shows decision outcomes with *actionable* alternative profiles, to help people understand what they could have done to change their model outcome.\n\n* **ML model developers**: Counterfactual explanations are also useful for model developers to debug their model for potential problems. DiCE can be used to show CF explanations for a selection of inputs that can uncover whether there are any problematic (in)dependences on some features (e.g., for 95% of inputs, changing features X and Y changes the outcome, but not for the other 5%). We aim to support aggregate metrics to help developers debug ML models.\n\n* **Decision makers**: Counterfactual explanations may be useful to\n  decision-makers such as doctors or judges who may use ML models to make decisions. 
For a particular individual, DiCE allows probing the ML model to see the possible changes that lead to a different ML outcome, thus enabling decision-makers to assess their trust in the prediction.\n\n* **Decision evaluators**: Finally, counterfactual explanations can be useful\n  to decision evaluators who may be interested in fairness or other desirable\n  properties of an ML model. We plan to add support for this in the future.\n\n\nRoadmap\n-------\nIdeally, counterfactual explanations should balance a wide range of suggested changes (*diversity*) against the relative ease of adopting those changes (*proximity* to the original input), while also following the causal laws of the world, e.g., one can hardly lower their educational degree or change their race.\n\nWe are working on adding the following features to DiCE:\n\n* Support for using DiCE for debugging machine learning models\n* Constructed English phrases (e.g., ``desired outcome if feature was changed``) and other ways to output the counterfactual examples\n* Evaluating feature attribution methods like LIME and SHAP on necessity and sufficiency metrics using counterfactuals (see `this paper <https://arxiv.org/abs/2011.04917>`_)\n* Support for Bayesian optimization and other algorithms for generating counterfactual explanations\n* Better feasibility constraints for counterfactual generation \n\nCiting\n-------\nIf you find DiCE useful for your research work, please cite it as follows.\n\nRamaravind K. Mothilal, Amit Sharma, and Chenhao Tan (2020). **Explaining machine learning classifiers through diverse counterfactual explanations**. *Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency*. 
\n\nBibtex::\n\n\t@inproceedings{mothilal2020dice,\n  \t\ttitle={Explaining machine learning classifiers through diverse counterfactual explanations},\n  \t\tauthor={Mothilal, Ramaravind K and Sharma, Amit and Tan, Chenhao},\n  \t\tbooktitle={Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency},\n  \t\tpages={607--617},\n  \t\tyear={2020}\n\t}\n\n\nContributing\n------------\n\nThis project welcomes contributions and suggestions.  Most contributions require you to agree to a\nContributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us\nthe rights to use your contribution. For details, visit https://cla.microsoft.com.\n\nWhen you submit a pull request, a CLA-bot will automatically determine whether you need to provide\na CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions\nprovided by the bot. You will only need to do this once across all repos using our CLA.\n\nThis project has adopted the `Microsoft Open Source Code of Conduct <https://opensource.microsoft.com/codeofconduct/>`_.\nFor more information see the `Code of Conduct FAQ <https://opensource.microsoft.com/codeofconduct/faq/>`_ or\ncontact `opencode@microsoft.com <mailto:opencode@microsoft.com>`_ with any additional questions or comments.\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Generate Diverse Counterfactual Explanations for any machine learning model.",
    "version": "0.11",
    "project_urls": {
        "Download": "https://github.com/interpretml/DiCE/archive/v0.11.tar.gz",
        "Homepage": "https://github.com/interpretml/DiCE"
    },
    "split_keywords": [
        "machine-learning",
        "explanation",
        "interpretability",
        "counterfactual"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "691cec136743072d7b4917d72d975e094c8dc9bce86920519aff97854a7dc3ce",
                "md5": "bf876d652e75cdb3dade38f5f2416db1",
                "sha256": "9a1c199f4a0f9a865319a17f00a76c07e21c9d090e1aa10f23ea13e25e8d8455"
            },
            "downloads": -1,
            "filename": "dice_ml-0.11-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bf876d652e75cdb3dade38f5f2416db1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 2526789,
            "upload_time": "2023-10-27T03:54:06",
            "upload_time_iso_8601": "2023-10-27T03:54:06.293214Z",
            "url": "https://files.pythonhosted.org/packages/69/1c/ec136743072d7b4917d72d975e094c8dc9bce86920519aff97854a7dc3ce/dice_ml-0.11-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f4fc2129adcdcd6ffb771ef369b9413ca8461d76b24c203b09d306bef56fb6cf",
                "md5": "26d6b2cd38dea22cf1a1f342d21a3710",
                "sha256": "29e6bea9e4c877caa68ecdd5d981f91cc8de3b73cdad453fd7c7e000a336576d"
            },
            "downloads": -1,
            "filename": "dice_ml-0.11.tar.gz",
            "has_sig": false,
            "md5_digest": "26d6b2cd38dea22cf1a1f342d21a3710",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 15030781,
            "upload_time": "2023-10-27T03:54:08",
            "upload_time_iso_8601": "2023-10-27T03:54:08.891044Z",
            "url": "https://files.pythonhosted.org/packages/f4/fc/2129adcdcd6ffb771ef369b9413ca8461d76b24c203b09d306bef56fb6cf/dice_ml-0.11.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-27 03:54:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "interpretml",
    "github_project": "DiCE",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "dice-ml"
}
        