# TimeSHAP
TimeSHAP is a model-agnostic, recurrent explainer that builds upon KernelSHAP and
extends it to the sequential domain.
TimeSHAP computes event/timestamp-, feature-, and cell-level attributions.
As sequences can be arbitrarily long, TimeSHAP also implements a pruning algorithm
based on Shapley values that finds the subset of consecutive, recent events
contributing the most to the decision.
This repository is the code implementation of the TimeSHAP algorithm
presented in the paper `TimeSHAP: Explaining Recurrent Models through Sequence Perturbations`
published at **KDD 2021**.
Links to the paper [here](https://arxiv.org/abs/2012.00073),
and to the video presentation [here](https://www.youtube.com/watch?v=Q7Q9o7ywXx8).
## Install TimeSHAP
##### Via Pip
```
pip install timeshap
```
##### Via Github
Clone the repository into a local directory using:
```
git clone https://github.com/feedzai/timeshap.git
```
Move into the cloned repo and install the package:
```
cd timeshap
pip install .
```
##### Test your installation
Start a Python session in your terminal using
```
python
```
And import TimeSHAP
```
import timeshap
```
## TimeSHAP in 30 seconds
#### Inputs
- Model being explained;
- Instance(s) to explain;
- Background instance.
#### Outputs
- Local pruning output (explaining a single instance);
- Local event explanations (explaining a single instance);
- Local feature explanations (explaining a single instance);
- Global pruning statistics (explaining multiple instances);
- Global event explanations (explaining multiple instances);
- Global feature explanations (explaining multiple instances).
### Model Interface
In order for TimeSHAP to explain a model, an entry point must be provided.
This `Callable` entry point must receive a 3-D numpy array of shape `(#sequences; #sequence length; #features)`
and return a 2-D numpy array of shape `(#sequences; 1)` with the corresponding score for each sequence.
Additionally, to speed up TimeSHAP, the entry point can also return the **hidden state**
of the model together with the score (when applicable). Although optional, we highly recommend it,
as it has a large impact on execution time.
If you choose to return the hidden state, it should be one of the following
(see the [notebook](notebooks/AReM/AReM_API_showcase.ipynb) for specific examples, and the sketch below):
- a 3-D numpy array, `(#rnn layers, #sequences, #hidden_dimension)` (class `ExplainedRNN` in the notebook);
- a tuple of numpy arrays that follows the previously described shape
(usually used with stacked RNNs with different hidden dimensions) (class `ExplainedGRU2Layer` in the notebook);
- a tuple of tuples of numpy arrays (usually used with LSTMs) (class `ExplainedLSTM` in the notebook).
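For concreteness, a minimal, self-contained sketch of both entry-point variants (the `DummyModel` below is hypothetical and only illustrates the required shapes):
```
import numpy as np

class DummyModel:
    """Hypothetical stand-in for a recurrent model, used only to illustrate shapes."""
    def __init__(self, n_layers=1, hidden_dim=8):
        self.n_layers, self.hidden_dim = n_layers, hidden_dim

    def predict(self, x):
        # One score per sequence, e.g. the mean of the last event's features.
        return x[:, -1, :].mean(axis=1, keepdims=True)

    def predict_with_state(self, x, hidden=None):
        scores = self.predict(x)
        hidden_state = np.zeros((self.n_layers, x.shape[0], self.hidden_dim))
        return scores, hidden_state

model = DummyModel()

# Score-only entry point: (#sequences, #sequence length, #features) -> (#sequences, 1)
f = lambda x: model.predict(x)

# Entry point that also returns the hidden state (recommended when applicable):
# returns (scores, hidden_state), with hidden_state shaped
# (#rnn layers, #sequences, #hidden_dimension).
f_hs = lambda x, y=None: model.predict_with_state(x, y)
```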
TimeSHAP is able to explain any black-box model as long as it complies with the
previously described interface, including both PyTorch and TensorFlow models,
both exemplified in our tutorials ([PyTorch](notebooks/AReM/AReM.ipynb), [TensorFlow](notebooks/AReM/AReM_TF.ipynb)).
Examples provided in our tutorials:
- **TensorFlow**
```
model = tf.keras.models.Model(inputs=inputs, outputs=ff2)
f = lambda x: model.predict(x)
```
- **PyTorch** - (example where the model receives and returns hidden states)
```
model_wrapped = TorchModelWrapper(model)
f_hs = lambda x, y=None: model_wrapped.predict_last_hs(x, y)
```
###### Model Wrappers
In order to facilitate the interface between models and TimeSHAP,
TimeSHAP implements `ModelWrappers`. These wrappers, used in the PyTorch
[tutorial](notebooks/AReM/AReM.ipynb) notebook, allow for greater flexibility
in the explained models, as they handle:
- **Batching logic**: useful when very large inputs or large numbers of perturbation samples
do not fit in GPU memory and therefore require batching;
- **Input format/type**: useful when your model does not work directly with numpy arrays,
as is the case in our PyTorch example;
- **Hidden state logic**: useful when the hidden states of your model do not match
the hidden state format required by TimeSHAP (see the sketch below).
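As an illustration only (this is not the package's `TorchModelWrapper`), a minimal sketch of a wrapper that adds batching and numpy-to-tensor conversion around a hypothetical PyTorch model whose forward pass returns `(scores, hidden_state)`:
```
import numpy as np
import torch

class BatchedTorchWrapper:
    """Illustrative wrapper: batches inputs and converts between numpy and torch tensors."""
    def __init__(self, model, batch_size=1024, device="cpu"):
        self.model = model.to(device).eval()
        self.batch_size = batch_size
        self.device = device

    def predict_last_hs(self, x, hidden=None):
        scores, states = [], []
        with torch.no_grad():
            for start in range(0, x.shape[0], self.batch_size):
                batch = torch.as_tensor(x[start:start + self.batch_size],
                                        dtype=torch.float32, device=self.device)
                hs = None if hidden is None else torch.as_tensor(
                    hidden[:, start:start + self.batch_size],
                    dtype=torch.float32, device=self.device)
                out, new_hs = self.model(batch, hs)  # hypothetical forward: (scores, hidden state)
                scores.append(out.cpu().numpy())
                states.append(new_hs.cpu().numpy())
        # Hidden states are concatenated along the sequence axis: (#layers, #sequences, #hidden_dim).
        return np.concatenate(scores, axis=0), np.concatenate(states, axis=1)
```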
### TimeSHAP Explanation Methods
TimeSHAP offers several methods, depending on the desired explanations.
Local methods provide a detailed view of a model decision for a specific
sequence being explained.
Global methods aggregate local explanations over a given dataset
to present a global view of the model.
#### Local Explanations
##### Pruning
[`local_pruning()`](src/timeshap/explainer/pruning.py) performs the pruning
algorithm on a given sequence with a user-defined tolerance and returns
the pruning index along with the information needed for plotting.
[`plot_temp_coalition_pruning()`](src/timeshap/plot/pruning.py) plots the pruning
algorithm information calculated by `local_pruning()`.
<img src="resources/images/pruning.png" width="400">
##### Event level explanations
[`local_event()`](src/timeshap/explainer/event_level.py) computes event-level explanations
for a given sequence with the user-given parameters and returns them.
[`plot_event_heatmap()`](src/timeshap/plot/event_level.py) plots the event-level explanations
calculated by `local_event()`.
<img src="resources/images/event_level.png" width="275">
##### Feature level explanations
[`local_feat()`](src/timeshap/explainer/feature_level.py) computes feature-level explanations
for a given sequence with the user-given parameters and returns them.
[`plot_feat_barplot()`](src/timeshap/plot/feature_level.py) plots the feature-level explanations
calculated by `local_feat()`.
<img src="resources/images/feature_level.png" width="350">
##### Cell level explanations
[`local_cell_level()`](src/timeshap/explainer/cell_level.py) computes cell-level explanations
for a given sequence, using the respective event- and feature-level explanations
and user-given parameters, and returns them.
[`plot_cell_level()`](src/timeshap/plot/cell_level.py) plots the cell-level explanations
calculated by `local_cell_level()`.
<img src="resources/images/cell_level.png" width="350">
##### Local Report
[`local_report()`](src/timeshap/explainer/local_methods.py) calculates TimeSHAP
local explanations for a given sequence and plots them.
<img src="resources/images/local_report.png" width="800">
#### Global Explanations
##### Global pruning statistics
[`prune_all()`](src/timeshap/explainer/pruning.py) performs the pruning
algorithm on multiple given sequences.
[`pruning_statistics()`](src/timeshap/plot/pruning.py) calculates the pruning
statistics for several user-given pruning tolerances using the pruning
data calculated by `prune_all()`, returning a `pandas.DataFrame` with the statistics.
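A heavily hedged sketch of the global workflow over a dataset of sequences; the DataFrame layout, dictionary keys, and argument order are assumptions modeled on the tutorial notebook:
```
import numpy as np
import pandas as pd
from timeshap.explainer import prune_all
from timeshap.plot import pruning_statistics

# Hypothetical dataset: one row per event, identified by a sequence id and a timestamp.
rng = np.random.default_rng(0)
n_seqs, seq_len, n_feats = 20, 48, 6
model_features = ['feat_' + str(i) for i in range(n_feats)]
dataset = pd.DataFrame(rng.random((n_seqs * seq_len, n_feats)), columns=model_features)
dataset['sequence_id'] = np.repeat(np.arange(n_seqs), seq_len)
dataset['timestamp'] = np.tile(np.arange(seq_len), n_seqs)
schema = list(dataset.columns)

pruning_dict = {'tol': [0.05, 0.075]}  # assumed: one or more tolerances to evaluate
# Assumed order: (f, data, pruning_dict, baseline, model_features, schema, entity_col, time_col)
prun_data = prune_all(f, dataset, pruning_dict, baseline,
                      model_features, schema, 'sequence_id', 'timestamp')
prun_stats = pruning_statistics(prun_data, pruning_dict['tol'])
```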
##### Global event level explanations
[`event_explain_all()`](src/timeshap/explainer/event_level.py) calculates TimeSHAP
event-level explanations for multiple instances given user-defined parameters.
[`plot_global_event()`](src/timeshap/plot/event_level.py) plots the global event-level explanations
calculated by `event_explain_all()`.
<img src="resources/images/global_event.png" width="600">
##### Global feature level explanations
[`feat_explain_all()`](src/timeshap/explainer/feature_level.py) calculates TimeSHAP
feature-level explanations for multiple instances given user-defined parameters.
[`plot_global_feat()`](src/timeshap/plot/feature_level.py) plots the global feature-level
explanations calculated by `feat_explain_all()`.
<img src="resources/images/global_feat.png" width="450">
##### Global report
[`global_report()`](src/timeshap/explainer/global_methods.py) calculates TimeSHAP
explanations for multiple instances, aggregates the explanations into two plots,
and returns them.
<img src="resources/images/global_report.png" width="800">
## Tutorial
To see TimeSHAP's interfaces and methods in action, consult
[AReM.ipynb](notebooks/AReM/AReM.ipynb).
In this tutorial we take an open-source dataset, process it, train a
PyTorch recurrent model on it, and use TimeSHAP to explain it, showcasing all
previously described methods.
We also train a TensorFlow model on the same dataset in
[AReM_TF.ipynb](notebooks/AReM/AReM_TF.ipynb).
## Repository Structure
- [`notebooks`](notebooks) - tutorial notebooks demonstrating the package;
- [`src/timeshap`](src/timeshap) - the package source code;
  - [`src/timeshap/explainer`](src/timeshap/explainer) - TimeSHAP methods to produce the explanations;
    - [`src/timeshap/explainer/kernel`](src/timeshap/explainer/kernel) - the `TimeSHAPKernel`;
  - [`src/timeshap/plot`](src/timeshap/plot) - TimeSHAP methods to produce explanation plots;
  - [`src/timeshap/utils`](src/timeshap/utils) - utility methods for TimeSHAP execution;
  - [`src/timeshap/wrappers`](src/timeshap/wrappers) - wrapper classes for models, to ease TimeSHAP explanations.
## Citing TimeSHAP
```
@inproceedings{bento2021timeshap,
author = {Bento, Jo\~{a}o and Saleiro, Pedro and Cruz, Andr\'{e} F. and Figueiredo, M\'{a}rio A.T. and Bizarro, Pedro},
title = {TimeSHAP: Explaining Recurrent Models through Sequence Perturbations},
year = {2021},
isbn = {9781450383325},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3447548.3467166},
doi = {10.1145/3447548.3467166},
booktitle = {Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining},
pages = {2565–2573},
numpages = {9},
keywords = {SHAP, Shapley values, TimeSHAP, XAI, RNN, explainability},
location = {Virtual Event, Singapore},
series = {KDD '21}
}
```