timeshap

Name: timeshap
Version: 1.0.4
Home page: https://github.com/feedzai/timeshap
Summary: KernelSHAP adaptation for recurrent models.
Upload time: 2023-09-13 09:24:00
Author: Feedzai
Requires Python: >=3.6
Keywords: explainability, timeshap
# TimeSHAP
TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and
extends it to the sequential domain.
TimeSHAP computes event/timestamp-, feature-, and cell-level attributions.
As sequences can be arbitrarily long, TimeSHAP also implements a pruning algorithm
based on Shapley values that finds a subset of consecutive, recent events that contributes
the most to the decision.


This repository is the code implementation of the TimeSHAP algorithm
presented in the paper `TimeSHAP: Explaining Recurrent Models through Sequence Perturbations`,
published at **KDD 2021**.

Links to the paper [here](https://arxiv.org/abs/2012.00073), 
and to the video presentation [here](https://www.youtube.com/watch?v=Q7Q9o7ywXx8).


## Install TimeSHAP

##### Via Pip
```
pip install timeshap
```

##### Via Github
Clone the repository into a local directory using:
```
git clone https://github.com/feedzai/timeshap.git
```

Move into the cloned repo and install the package:

```
cd timeshap
pip install .
```


##### Test your installation
Start a Python session in your terminal using

```
python
```

And import TimeSHAP

```
import timeshap
```

## TimeSHAP in 30 seconds

#### Inputs
- Model being explained;
- Instance(s) to explain;
- Background instance.

#### Outputs
- Local pruning output; (explaining a single instance)
- Local event explanations; (explaining a single instance)
- Local feature explanations; (explaining a single instance)
- Global pruning statistics; (explaining multiple instances)
- Global event explanations; (explaining multiple instances)
- Global feature explanations; (explaining multiple instances)

### Model Interface

In order for TimeSHAP to explain a model, an entry point must be provided.
This `Callable` entry point must receive a 3-D numpy array of shape `(#sequences, #sequence length, #features)`
and return a 2-D numpy array of shape `(#sequences, 1)` with the corresponding score of each sequence.
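
A minimal sketch of a compliant entry point (pure numpy; `my_model` is a placeholder for your own per-sequence scoring function, not part of TimeSHAP):

```
import numpy as np

def f(x: np.ndarray) -> np.ndarray:
    # x has shape (#sequences, #sequence length, #features)
    # `my_model` is hypothetical and must return one score per sequence
    scores = np.array([my_model(seq) for seq in x])
    return scores.reshape(-1, 1)  # shape (#sequences, 1)
```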

In addition, to make TimeSHAP more efficient, it is possible to return the **hidden state**
of the model together with the score (if applicable). Although this is optional, we highly recommend it,
as it has a very large impact on performance.
If you choose to return the hidden state, it should be one of the following
(see the [notebook](notebooks/AReM/AReM_API_showcase.ipynb) for specific examples):
 - a 3-D numpy array, `(#rnn layers, #sequences, #hidden_dimension)` (class `ExplainedRNN` in the notebook);
 - a tuple of numpy arrays that follow the previously described format
 (usually used with stacked RNNs that have different hidden dimensions) (class `ExplainedGRU2Layer` in the notebook);
 - a tuple of tuples of numpy arrays (usually used with LSTMs) (class `ExplainedLSTM` in the notebook).

TimeSHAP is able to explain any black-box model as long as it complies with the
previously described interface, including both PyTorch and TensorFlow models,
both exemplified in our tutorials ([PyTorch](notebooks/AReM/AReM.ipynb), [TensorFlow](notebooks/AReM/AReM_TF.ipynb)).

Examples provided in our tutorials:
- **TensorFlow**
```
model = tf.keras.models.Model(inputs=inputs, outputs=ff2)
f = lambda x: model.predict(x)
```

- **PyTorch** - (example where the model receives and returns hidden states)
```
model_wrapped = TorchModelWrapper(model)
f_hs = lambda x, y=None: model_wrapped.predict_last_hs(x, y)
```


###### Model Wrappers
To facilitate the interface between models and TimeSHAP,
TimeSHAP implements `ModelWrappers`. These wrappers, used in the PyTorch
[tutorial](notebooks/AReM/AReM.ipynb) notebook, allow for greater flexibility
in the models being explained, as they support:
- **Batching logic**: useful when very large inputs or large `nsamples` cannot fit
in GPU memory and a batching mechanism is therefore required;
- **Input format/type**: useful when your model does not work with numpy arrays. This
is the case of our provided PyTorch example;
- **Hidden state logic**: useful when the hidden states of your model do not match
the hidden state format required by TimeSHAP (a sketch of a custom wrapper is shown below).
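
As an illustration only, a custom wrapper with simple batching logic could look like the following sketch (the `model` callable and batch size are assumptions; the wrappers shipped in [`src/timeshap/wrappers`](src/timeshap/wrappers) cover the common cases):

```
import numpy as np

class BatchedModelWrapper:
    """Illustrative only: splits large inputs into batches before scoring."""
    def __init__(self, model, batch_size=1024):
        self.model = model          # your own callable returning (#sequences, 1) scores
        self.batch_size = batch_size

    def predict(self, x: np.ndarray) -> np.ndarray:
        batches = [self.model(x[i:i + self.batch_size])
                   for i in range(0, x.shape[0], self.batch_size)]
        return np.concatenate(batches, axis=0)

f = BatchedModelWrapper(model).predict
```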


### TimeSHAP Explanation Methods
TimeSHAP offers several methods, depending on the desired explanations.
Local methods provide a detailed view of the model decision for
a specific sequence being explained.
Global methods aggregate local explanations of a given dataset
to present a global view of the model.

#### Local Explanations
##### Pruning

[`local_pruning()`](src/timeshap/explainer/pruning.py) performs the pruning
algorithm on a given sequence with a user-defined tolerance and returns
the pruning index along with the information needed for plotting.

[`plot_temp_coalition_pruning()`](src/timeshap/plot/pruning.py) plots the pruning 
algorithm information calculated by `local_pruning()`.

<img src="resources/images/pruning.png" width="400">
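
A usage sketch based on the AReM tutorial (names such as `f`, `pos_x_data` (the sequence being explained), `average_event` (the background instance), `entity_uuid`, and `entity_col` are placeholders; the exact argument order may differ in your installed version):

```
from timeshap.explainer import local_pruning
from timeshap.plot import plot_temp_coalition_pruning

pruning_dict = {'tol': 0.025}
coal_plot_data, coal_prun_idx = local_pruning(
    f, pos_x_data, pruning_dict, average_event, entity_uuid, entity_col, verbose=True)
# coal_prun_idx is expressed in negative terms (events counted from the end)
pruning_idx = pos_x_data.shape[1] + coal_prun_idx
pruning_plot = plot_temp_coalition_pruning(coal_plot_data, coal_prun_idx, plot_limit=40)
```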

##### Event level explanations

[`local_event()`](src/timeshap/explainer/event_level.py) calculates event-level explanations
of a given sequence with the user-given parameters and returns the respective
event-level explanations.

[`plot_event_heatmap()`](src/timeshap/plot/event_level.py) plots the event-level explanations
calculated by `local_event()`.

<img src="resources/images/event_level.png" width="275">
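
For illustration, reusing the placeholders above and the `pruning_idx` obtained from pruning (argument order follows the AReM tutorial and may differ across versions):

```
from timeshap.explainer import local_event
from timeshap.plot import plot_event_heatmap

event_dict = {'rs': 42, 'nsamples': 32000}
event_data = local_event(
    f, pos_x_data, event_dict, entity_uuid, entity_col, average_event, pruning_idx)
event_plot = plot_event_heatmap(event_data)
```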

##### Feature level explanations

[`local_feat()`](src/timeshap/explainer/feature_level.py) calculates feature-level explanations
of a given sequence with the user-given parameters and returns the respective
feature-level explanations.

[`plot_feat_barplot()`](src/timeshap/plot/feature_level.py) plots the feature-level explanations
calculated by `local_feat()`.

<img src="resources/images/feature_level.png" width="350">
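
A sketch under the same assumptions (`model_features` and `plot_feats`, a mapping from model feature names to display names, are placeholders from the tutorial):

```
from timeshap.explainer import local_feat
from timeshap.plot import plot_feat_barplot

feature_dict = {'rs': 42, 'nsamples': 32000,
                'feature_names': model_features, 'plot_features': plot_feats}
feature_data = local_feat(
    f, pos_x_data, feature_dict, entity_uuid, entity_col, average_event, pruning_idx)
feature_plot = plot_feat_barplot(feature_data)
```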

##### Cell level explanations

[`local_cell_level()`](src/timeshap/explainer/cell_level.py) calculates cell-level explanations
of a given sequence, using the respective event- and feature-level explanations
and user-given parameters, and returns the respective cell-level explanations.

[`plot_cell_level()`](src/timeshap/plot/cell_level.py) plots the cell-level explanations
calculated by `local_cell_level()`.

<img src="resources/images/cell_level.png" width="350">
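
A sketch reusing the event- and feature-level results from above (placeholder names and argument order as in the AReM tutorial; they may differ across versions):

```
from timeshap.explainer import local_cell_level
from timeshap.plot import plot_cell_level

cell_dict = {'rs': 42, 'nsamples': 32000, 'top_x_feats': 2, 'top_x_events': 2}
cell_data = local_cell_level(
    f, pos_x_data, cell_dict, event_data, feature_data,
    entity_uuid, entity_col, average_event, pruning_idx)
cell_plot = plot_cell_level(cell_data, model_features, plot_feats)
```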

##### Local Report

[`local_report()`](src/timeshap/explainer/local_methods.py) calculates TimeSHAP 
local explanations for a given sequence and plots them.

<img src="resources/images/local_report.png" width="800">
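
A sketch bundling the previous steps into a single call (keyword names follow the AReM tutorial and may differ across versions):

```
from timeshap.explainer import local_report

pruning_dict = {'tol': 0.025}
event_dict = {'rs': 42, 'nsamples': 32000}
feature_dict = {'rs': 42, 'nsamples': 32000, 'feature_names': model_features}
cell_dict = {'rs': 42, 'nsamples': 32000, 'top_x_feats': 2, 'top_x_events': 2}
local_report(f, pos_x_data, pruning_dict, event_dict, feature_dict,
             cell_dict=cell_dict, entity_uuid=entity_uuid,
             entity_col=entity_col, baseline=average_event)
```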

#### Global Explanations


##### Global pruning statistics

[`prune_all()`](src/timeshap/explainer/pruning.py) performs the pruning
algorithm on multiple given sequences.

[`pruning_statistics()`](src/timeshap/plot/pruning.py) calculates the pruning
statistics for several user-given pruning tolerances using the pruning
data calculated by `prune_all()`, returning a `pandas.DataFrame` with the statistics.
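
A sketch of the global pruning step (`d_train` is a dataset of sequences, `schema` its column names, and `time_col` the column used to order events; all are placeholders from the tutorial, and the exact signatures may differ across versions):

```
from timeshap.explainer import prune_all
from timeshap.plot import pruning_statistics

pruning_dict = {'tol': [0.05, 0.075]}
prun_indexes = prune_all(f, d_train, pruning_dict, average_event,
                         model_features=model_features, schema=schema,
                         entity_col=entity_col, time_col=time_col)
pruning_stats = pruning_statistics(prun_indexes, pruning_dict.get('tol'))
```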


##### Global event level explanations

[`event_explain_all()`](src/timeshap/explainer/event_level.py) calculates TimeSHAP
event-level explanations for multiple instances, given user-defined parameters.

[`plot_global_event()`](src/timeshap/plot/event_level.py) plots the global event-level explanations
calculated by `event_explain_all()`.

<img src="resources/images/global_event.png" width="600">
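
A sketch reusing `prun_indexes` from the global pruning step (placeholder names as above; argument order may differ across versions):

```
from timeshap.explainer import event_explain_all
from timeshap.plot import plot_global_event

event_dict = {'rs': 42, 'nsamples': 32000}
event_data = event_explain_all(f, d_train, event_dict, prun_indexes,
                               average_event, model_features, schema,
                               entity_col, time_col)
event_global_plot = plot_global_event(event_data)
```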

##### Global feature level explanations

[`feat_explain_all()`](src/timeshap/explainer/feature_level.py) calculates TimeSHAP
feature-level explanations for multiple instances, given user-defined parameters.

[`plot_global_feat()`](src/timeshap/plot/feature_level.py) plots the global feature-level 
explanations calculated by `feat_explain_all()`.

<img src="resources/images/global_feat.png" width="450">
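
A sketch analogous to the event-level case (same placeholder names; signatures may differ across versions):

```
from timeshap.explainer import feat_explain_all
from timeshap.plot import plot_global_feat

feature_dict = {'rs': 42, 'nsamples': 32000, 'feature_names': model_features}
feature_data = feat_explain_all(f, d_train, feature_dict, prun_indexes,
                                average_event, model_features, schema,
                                entity_col, time_col)
feature_global_plot = plot_global_feat(feature_data)
```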


##### Global report
[`global_report()`](src/timeshap/explainer/global_methods.py) calculates TimeSHAP
explanations for multiple instances, aggregates the explanations into two plots,
and returns them.

<img src="resources/images/global_report.png" width="800">
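
A sketch of the aggregated call; the returned objects shown here (pruning statistics plus an aggregated plot) follow the AReM tutorial and may differ across versions:

```
from timeshap.explainer import global_report

pruning_dict = {'tol': 0.05}
event_dict = {'rs': 42, 'nsamples': 32000}
feature_dict = {'rs': 42, 'nsamples': 32000, 'feature_names': model_features}
prun_stats, global_plot = global_report(f, d_train, pruning_dict, event_dict,
                                        feature_dict, average_event,
                                        model_features, schema,
                                        entity_col, time_col)
```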



## Tutorial
To see TimeSHAP's interfaces and methods demonstrated, consult
[AReM.ipynb](notebooks/AReM/AReM.ipynb).
In this tutorial we obtain an open-source dataset, process it, train a
PyTorch recurrent model on it, and use TimeSHAP to explain it, showcasing all the
previously described methods.

Additionally, we also train a TensorFlow model on the same dataset in
[AReM_TF.ipynb](notebooks/AReM/AReM_TF.ipynb).

## Repository Structure

- [`notebooks`](notebooks) - tutorial notebooks demonstrating the package;
- [`src/timeshap`](src/timeshap) - the package source code;
  - [`src/timeshap/explainer`](src/timeshap/explainer) - TimeSHAP methods to produce the explanations
  - [`src/timeshap/explainer/kernel`](src/timeshap/explainer/kernel) - TimeSHAPKernel
  - [`src/timeshap/plot`](src/timeshap/plot) - TimeSHAP methods to produce explanation plots
  - [`src/timeshap/utils`](src/timeshap/utils) - utility methods for TimeSHAP execution
  - [`src/timeshap/wrappers`](src/timeshap/wrappers) - wrapper classes for models, to ease producing TimeSHAP explanations

## Citing TimeSHAP
```
@inproceedings{bento2021timeshap,
    author = {Bento, Jo\~{a}o and Saleiro, Pedro and Cruz, Andr\'{e} F. and Figueiredo, M\'{a}rio A.T. and Bizarro, Pedro},
    title = {TimeSHAP: Explaining Recurrent Models through Sequence Perturbations},
    year = {2021},
    isbn = {9781450383325},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    url = {https://doi.org/10.1145/3447548.3467166},
    doi = {10.1145/3447548.3467166},
    booktitle = {Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining},
    pages = {2565–2573},
    numpages = {9},
    keywords = {SHAP, Shapley values, TimeSHAP, XAI, RNN, explainability},
    location = {Virtual Event, Singapore},
    series = {KDD '21}
}
```

            
