# xai_evals
## by - [AryaXAI](https://www.aryaxai.com/)
**`xai_evals`** is a Python package designed to generate and benchmark various explainability methods for machine learning and deep learning models. It offers tools for creating and evaluating explanations of popular machine learning models, supporting widely-used explanation methods. The package aims to streamline the interpretability of machine learning models, allowing practitioners to gain insights into how their models make predictions. Additionally, it includes several metrics for assessing the quality of these explanations . [](LICENSE)
Technical Report : [xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods](https://arxiv.org/abs/2502.03014)

---
## Table of Contents
- [Installation](#installation)
- [Usage](#usage)
- [SHAP Tabular Explainer](#shap-tabular-explainer)
- [LIME Tabular Explainer](#lime-tabular-explainer)
- [Torch Tabular Explainer](#torch-tabular-explainer)
- [TFKeras Tabular Explainer](#tfkeras-tabular-explainer)
- [DlBacktrace Tabular Explainer](#dlbacktrace-tabular-explainer)
- [Tabular Metrics Calculation](#tabular-metrics-calculation)
- [Torch Image Explainer](#torch-image-explainer)
- [TFKeras Image Explainer](#tfkeras-image-explainer)
- [DlBacktrace Image Explainer](#dlbacktrace-image-explainer)
- [Tabular Metrics Calculation](#tabular-metrics-calculation)
- [Image Metrics Calculation](#image-metrics-calculation)
- [License](#license)
---
## Installation
To install **`xai_evals`**, you can use `pip`. First, clone the repository or download the files to your local environment. Then, install the necessary dependencies:
```bash
pip install xai_evals
```
## Example Notebooks :
### Tensorflow-Keras :
| Name | Dataset | Link |
|-------------|-------------|-------------------------------|
| Tabualar ML Models Illustration and Evaluation Metrics | IRIS Dataset | [Colab Link](https://colab.research.google.com/drive/1UoT5Gx5d_L1KQmiirGUyyE1b9ajayO3L?usp=sharing) |
| Tabular Deep Learning Model Illustration and Evaluation Metrics | Lending Club | [Colab Link](https://colab.research.google.com/drive/17vuRt4D7ph6ZnAbrWMJ2aRum2mk14Tc6?usp=sharing) |
| Image Deep Learning Model Illustration and Evaluation Metrics | CIFAR10 | [Colab Link](https://colab.research.google.com/drive/1DNUMT6CNx2VGHsK8qhl3dEEtoN3eA7ar?usp=sharing) |
## Usage
## Usage : Machine Learning Models
Supported Machine Learning Models for `SHAPExplainer` and `LIMEExplainer` class is as follows :
| **Library** | **Supported Models** |
|-------------------------|------------------------------------------------------------------------------------------------------|
| **scikit-learn** | LogisticRegression, RandomForestClassifier, SVC, SGDClassifier, GradientBoostingClassifier, AdaBoostClassifier, DecisionTreeClassifier, KNeighborsClassifier, GaussianNB, LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis, KMeans, NearestCentroid, BaggingClassifier, VotingClassifier, MLPClassifier, LogisticRegressionCV, RidgeClassifier, ElasticNet |
| **xgboost** | XGBClassifier |
| **catboost** | CatBoostClassifier |
| **lightgbm** | LGBMClassifier |
| **sklearn.ensemble** | HistGradientBoostingClassifier, ExtraTreesClassifier |
### SHAP Tabular Explainer
The `SHAPExplainer` class allows you to compute and visualize **SHAP** values for your trained model. It supports various types of models, including tree-based models (e.g., `RandomForest`, `XGBoost`) and deep learning models (e.g., PyTorch models).
**Example:**
```python
from xai_evals.explainer import SHAPExplainer
from sklearn.ensemble import RandomForestClassifier
import pandas as pd
from sklearn.datasets import load_iris
# Load dataset and train a model
data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
model = RandomForestClassifier()
model.fit(X, y)
# Initialize SHAP explainer
shap_explainer = SHAPExplainer(model=model, features=X.columns, task="multiclass-classification", X_train=X)
# Explain a specific instance (e.g., the first instance in the test set)
shap_attributions = shap_explainer.explain(X, instance_idx=0)
# Print the feature attributions
print(shap_attributions)
```
| **Feature** | **Value** | **Attribution** |
|-----------------------|-----------|-----------------|
| petal_length_(cm) | 1.4 | 0.360667 |
| petal_width_(cm) | 0.2 | 0.294867 |
| sepal_length_(cm) | 5.1 | 0.023467 |
| sepal_width_(cm) | 3.5 | 0.010500 |
### LIME Tabular Explainer
The `LIMEExplainer` class allows you to generate **LIME** explanations, which work by perturbing the input data and fitting a locally interpretable model.
**Example:**
```python
from xai_evals.explainer import LIMEExplainer
from sklearn.linear_model import LogisticRegression
import pandas as pd
from sklearn.datasets import load_iris
# Load dataset and train a model
data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
model = LogisticRegression(max_iter=200)
model.fit(X, y)
# Initialize LIME explainer
lime_explainer = LIMEExplainer(model=model, features=X.columns, task="multiclass-classification", X_train=X)
# Explain a specific instance (e.g., the first instance in the test set)
lime_attributions = lime_explainer.explain(X, instance_idx=0)
# Print the feature attributions
print(lime_attributions)
```
| **Feature** | **Value** | **Attribution** |
|-----------------------|-----------|-----------------|
| petal_length_(cm) | 1.4 | 0.497993 |
| petal_width_(cm) | 0.2 | 0.213963 |
| sepal_length_(cm) | 5.1 | 0.127047 |
| sepal_width_(cm) | 3.5 | 0.053926 |
For **LIMEExplainer and SHAPExplainer Class** we have several attributes :
| Attribute | Description | Values |
|--------------|-------------|--------|
| model | Trained model which you want to explain | [sklearn model] |
| features | Features present in the Training/Testing Set | [list of features] |
| X_train | Training Set Data | {pd.dataframe,numpy.array} |
| task | Task performed by the model | {binary,multiclass} |
| model_classes (Only for LIME) | List of Classes to be predicted by model | [list of classes] |
| subset_samples (Only for SHAP) | If we want to use k-means based sampling to use a subset for SHAP Explainer | True/False |
| subset_number (Only for SHAP)| Number of samples to sample if subset_samples is True | int |
## Usage : Deep Learning Models
### Torch Tabular Explainer
The `TorchTabularExplainer` class allows you to generate explanations for Pytorch Deep Learning Model . Explaination Method available include 'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep' and 'lime'.
| Attribute | Description | Values |
|--------------|-------------|--------|
| model | Trained Torch model which you want to explain | [Torch Model] |
| method | Explanation method. Options:'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep', 'lime' | string |
| X_train | Training Set Data | {pd.dataframe,numpy.array} |
| feature_names | Features present in the Training/Testing Set | [list of features] |
| task | Task performed by the model | {binary-classification,multiclass-classification} |
### TFKeras Tabular Explainer
The `TFTabularExplainer` class allows you to generate explanations for Tensorflow/Keras Deep Learning Model . Explaination Method available include 'shap_kernel', 'shap_deep' and 'lime'.
| Attribute | Description | Values |
|--------------|-------------|--------|
| model | Trained Tf/Keras model which you want to explain | [Tf/Keras Model] |
| method | Explanation method. Options:'shap_kernel', 'shap_deep', 'lime' | string |
| X_train | Training Set Data | {pd.dataframe,numpy.array} |
| feature_names | Features present in the Training/Testing Set | [list of features] |
| task | Task performed by the model | {binary-classification,multiclass-classification} |
### DlBacktrace Tabular Explainer
The `DlBacktraceTabularExplainer` , based on DLBacktrace, a method for analyzing neural networks by tracing the relevance of each component from output to input, to understand how each part contributes to the final prediction. It offers two modes: Default and Contrast, and is compatible with TensorFlow and PyTorch. (https://github.com/AryaXAI/DLBacktrace)
| Attribute | Description | Values |
|--------------|-------------|--------|
| model | Trained Tf/Keras/Torch model which you want to explain | [Torch/Tf/Keras Model] |
| method | Explanation method. Options:"default" or "contrastive" | string |
| X_train | Training Set Data | {pd.dataframe,numpy.array} |
| scaler | Total / Starting Relevance at the Last Layer | Integer (Default: 1) |
| feature_names | Features present in the Training/Testing Set | [list of features] |
| thresholding | Thresholding for Model Prediction | float (Default : 0.5) |
| task | Task performed by the model | {binary-classification,multiclass-classification} |
### Torch Image Explainer
The `TorchImageExplainer` class allows you to generate explanations for PyTorch-based CNN models. This class wraps around several attribution methods available in Captum, including:
- **Integrated Gradients**
- **Saliency**
- **DeepLift**
- **GradientShap**
- **GuidedBackprop**
- **Occlusion**
- **LayerGradCam**
**Example:**
```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from xai_evals.explainer import TorchImageExplainer
from torchvision import models
import numpy as np
import matplotlib.pyplot as plt
# Load CIFAR-10 dataset
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True)
# Load pre-trained ResNet model
model = models.resnet18(pretrained=True)
model.eval()
# Initialize the TorchImageExplainer
explainer = TorchImageExplainer(model)
# Example 1: Explain using DataLoader (batch of images)
idx = 0 # Index for the image in the DataLoader
method = "integrated_gradients"
task = "classification"
attribution_map = explainer.explain(trainloader, idx, method, task)
# Visualize attribution map (simplified)
plt.imshow(attribution_map)
plt.title(f"Attribution Map - {method} for Dataloader Torch")
plt.show()
# Example 2: Explain using a single image (torch.Tensor)
single_image_tensor = torch.randn(3, 32, 32) # Random image as a tensor, [C, H, W]
attribution_map = explainer.explain(single_image_tensor, idx=None, method=method, task=task)
# Visualize attribution map for the single image
plt.imshow(attribution_map)
plt.title(f"Attribution Map - {method} for Single Image (Tensor)")
plt.show()
# Example 3: Explain using a single image (np.ndarray)
single_image_numpy = np.random.randn(3, 32, 32) # Random image as a NumPy array, [C, H, W]
attribution_map = explainer.explain(single_image_numpy, idx=None, method=method, task=task)
# Visualize attribution map for the single image (NumPy)
plt.imshow(attribution_map)
plt.title(f"Attribution Map - {method} for Single Image (NumPy)")
plt.show()
```
#### **TorchImageExplainer**: `explain` Function Attributes
| **Attribute** | **Description** | **Values** |
|---------------|-----------------|-----------|
| `testdata` | The input data, which can be a DataLoader, NumPy array, or Tensor. | `[torch.utils.data.DataLoader, np.ndarray, torch.Tensor]` |
| `idx` | The index of the test sample to explain. | `int` or `None` (for explaining a single sample or all samples) |
| `method` | The explanation method to use. | `{grad_cam, integrated_gradients, saliency, deep_lift, gradient_shap, guided_backprop, occlusion, layer_gradcam, feature_ablation}` |
| `task` | The type of model task (e.g., classification). | `{classification}` |
---
### TFKeras Image Explainer
The `TFImageExplainer` class provides a similar functionality for TensorFlow/Keras-based models, allowing you to generate explanations for images using methods like GradCAM and Occlusion Sensitivity.
**Example:**
```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import cifar10
import numpy as np
import matplotlib.pyplot as plt
from xai_evals.explainer import TFImageExplainer
# Step 1: Define a Custom CNN Model
def create_custom_model():
model = models.Sequential([
layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.Flatten(),
layers.Dense(64, activation='relu'),
layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
return model
# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Normalize pixel values to be between 0 and 1
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
# Create a TensorFlow Dataset from the test data
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)
# Initialize and train the custom model
model = create_custom_model()
# Train the model
model.fit(x_train, y_train, epochs=1, batch_size=64)
# Step 2: Use the TFImageExplainer with the Custom Model
explainer = TFImageExplainer(model)
# Example 1: Explain a Single Image (NumPy Array)
image = x_test[0] # Select the first image
label = y_test[0] # Get the label for the first image
# Generate the Grad-CAM explanation for the image
attribution_map = explainer.explain(image, idx=None, method="grad_cam", task="classification")
# Visualize the attribution map
plt.imshow(attribution_map, cmap="jet")
plt.colorbar()
plt.title("Grad-CAM Attribution Map for CIFAR-10 Image")
plt.show()
# Example 2: Explain an Image from the TensorFlow Dataset (Using idx)
idx = 10 # Select the 10th image from the test dataset
# Generate the Grad-CAM explanation for the image at index `idx`
attribution_map = explainer.explain(test_dataset, idx, method="grad_cam", task="classification")
# Visualize the attribution map
plt.imshow(attribution_map, cmap="jet")
plt.colorbar()
plt.title(f"Grad-CAM Attribution Map for Image Index {idx} in CIFAR-10")
plt.show()
```
#### **TFImageExplainer**: `explain` Function Attributes
| **Attribute** | **Description** | **Values** |
|---------------|-----------------|-----------|
| `testset` | The input data, which can be a NumPy array, TensorFlow tensor, or Dataset. | `[np.ndarray, tf.Tensor, tf.data.Dataset]` |
| `idx` | The index of the test sample to explain. | `int` or `None` (for explaining a single sample or all samples) |
| `method` | The explanation method to use. | `{grad_cam, occlusion}` |
| `task` | The type of model task (e.g., classification). | `{classification}` |
| `label` | The class label for the input sample (used for classification tasks). | `int` |
---
### DlBacktrace Image Explainer
The `DlBacktraceImageExplainer` based on DLBacktrace, a method for analyzing neural networks by tracing the relevance of each component from output to input, to understand how each part contributes to the final prediction. It offers two modes: Default and Contrast, and is compatible with TensorFlow and PyTorch. (https://github.com/AryaXAI/DLBacktrace)
**Example: Tensorflow Model DlBacktraceImageExplainer**
```python
# Load CIFAR-10 data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Normalize pixel values to be between 0 and 1
x_train, x_test = x_train / 255.0, x_test / 255.0
# Create a simple CNN model for CIFAR-10
def create_cnn_model():
model = models.Sequential([
layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.MaxPooling2D((2, 2)),
layers.Conv2D(64, (3, 3), activation='relu'),
layers.Flatten(),
layers.Dense(64, activation='relu'),
layers.Dense(10)
])
model.compile(optimizer='adam',
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
metrics=['accuracy'])
return model
# Create the model
model = create_cnn_model()
# Train the model
model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
# Save the model for later use
model.save('cifar10_cnn_model.h5')
explainer = BacktraceImageExplainer(model=model)
# Choose an image from the test set
test_image = x_test[0:1] # Selecting the first image for testing
# Get the explanation for the test image
explanation = explainer.explain(test_image, instance_idx=0,mode='default', scaler=1, thresholding=0, task='multi-class-classification')
# Plot the explanation (relevance map)
plt.imshow(explanation, cmap='hot')
plt.colorbar()
plt.title("Feature Relevance for CIFAR-10 Image")
plt.show()
```
**Example: Torch Model DlBacktraceImageExplainer**
```python
# Define a simple CNN model for CIFAR-10 without using `view()`
class SimpleCNN(nn.Module):
def __init__(self, num_classes=10):
super(SimpleCNN, self).__init__()
self.identity = nn.Identity()
self.conv1 = nn.Conv2d(3, 16, 5,2)
self.relu1 = nn.ReLU()
self.conv2 = nn.Conv2d(16, 32, 3,2)
self.relu2 = nn.ReLU()
self.flatten = nn.Flatten()
self.fc1 = nn.Linear(32 * 6 * 6, 512)
self.relu3 = nn.ReLU()
self.fc2 = nn.Linear(512, num_classes)
def forward(self, x):
x = self.identity(x)
x = self.conv1(x)
x = self.relu1(x)
x = self.conv2(x)
x = self.relu2(x)
x = self.flatten(x)
x = self.fc1(x)
x = self.relu3(x)
x = self.fc2(x)
return x
# Load CIFAR-10 data with transforms for normalization
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True)
testloader = DataLoader(testset, batch_size=4, shuffle=False)
# Initialize and train the model
model = SimpleCNN()
model.train()
# Define loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
# Training loop
for epoch in range(1): # Just a couple of epochs for testing
running_loss = 0.0
for inputs, labels in trainloader:
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
running_loss += loss.item()
print("Finished Training")
# Test the model using the BacktraceImageExplainer
explainer = DlBacktraceImageExplainer(model=model)
# Get the explanation for the first image
explanation = explainer.explain(testloader, instance_idx=0, mode='default', scaler=1, thresholding=0, task='multi-class-classification')
# Plot the explanation (relevance map)
plt.imshow(explanation, cmap='hot')
plt.colorbar()
plt.title("Feature Relevance for CIFAR-10 Image")
plt.show()
```
#### **DlBacktraceImageExplainer**: `explain` Function Attributes
| **Attribute** | **Description** | **Values** |
|---------------|-----------------|-----------|
| `test_data` | The input data, which can be a NumPy array, TensorFlow tensor, or Dataset. | `[np.ndarray, tf.Tensor, tf.data.Dataset]` |
| `instance_idx` | The index of the test sample to explain. | `int` (explaining a single sample) |
| `mode` | The explanation mode to use. | `{default, contrast}` |
| `task` | The type of model task (e.g., classification). | `{binary-classification,multiclass-classification}` |
| `scaler` | Total / Starting Relevance at the Last Layer | `float` ( Default: None, Preferred: 1) |
| `thresholding` | Thresholding Model Prediction to predict the actual class. | `float` |
| `contrast_mode` | Mode to Use if using 'contrast' mode of DlBacktrace Algorithm | `{Positive,Negative}` |
---
### Tabular Metrics Calculation
The **`xai_evals`** package provides a powerful class, **`ExplanationMetricsTabular`**, to evaluate the quality of explanations generated by SHAP and LIME. This class allows you to calculate several metrics, helping you assess the robustness, reliability, and interpretability of your model explanations. [NOTE: Metrics only supports Sklearn ML Models]
#### ExplanationMetrics Class
The **`ExplanationMetricsTabular`** class in `xai_evals` provides a structured way to evaluate the quality and reliability of explanations generated by SHAP or LIME for machine learning models. By assessing multiple metrics, you can better understand how well these explanations align with your model's predictions and behavior.
---
#### Steps for Using ExplanationMetrics
1. **Initialize ExplanationMetrics**
Begin by creating an instance of the `ExplanationMetricsTabular` class with the necessary inputs, including the model, explainer type, dataset, and the task type.
```python
from xai_evals.metrics import ExplanationMetricsTabular
from xai_evals.explainer import SHAPExplainer
from sklearn.ensemble import RandomForestClassifier
import pandas as pd
from sklearn.datasets import load_iris
# Load dataset and train a model
data = load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
model = RandomForestClassifier()
model.fit(X, y)
# Initialize ExplanationMetrics with SHAP explainer
explanation_metrics = ExplanationMetricsTabular(
model=model,
explainer_name="shap",
X_train=X,
X_test=X,
y_test=y,
features=X.columns,
task="binary"
)
```
For **ExplanationMetricsTabular Class** we have several attributes :
| Attribute | Description | Values |
|--------------|-------------|--------|
| model | Trained model which you want to explain | {binary-classification, multiclass-classification}|
| X_train | Training Set Data | {pd.dataframe,numpy.array} |
| explainer_name | Which explaination method to use | {'shap','lime','torch','tensorflow', 'backtrace'} |
| X_test | Test Set Data | {pd.dataframe,numpy.array} |
| y_test | Test Set Labels | pd.dataseries |
| features | Features present in the Training/Testing Set | [list of features] |
| task | Task performed by the model | {binary-classification,multiclass-classification} |
| metrics | List of metrics to calculate | ['faithfulness', 'infidelity', 'sensitivity', 'comprehensiveness', 'sufficiency', 'monotonicity', 'complexity', 'sparseness'] |
| method | For specifying which explaination Method to use in Torch/Tensorflow/Backtrace Explainer | Torch-{ 'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep','lime'}, Tensorflow-{'shap_kernel','shap_deep','lime'},Backtrace-{'Default','Contrastive'} |
| start_idx | Starting index of the dataset to evaluate | int |
| end_idx | Ending index of the dataset to evaluate | int |
| scaler | Total / Starting Relevance at the Last Layer Integer ( For Backtrace) | int (Default: None, Preferred: 1) |
|thresholding | Thresholding Model Prediction | float (default=0.5) |
|subset_samples | If we want to use k-means based sampling to use a subset for SHAP Explainer (Only for SHAP) | True/False |
|subset_number | Number of samples to sample if subset_samples is True (Only for SHAP) | int |
2. **Calculate Explanation Metrics**
Use the `calculate_metrics` method to compute various metrics for evaluating explanations. The method returns a DataFrame with the results.
```python
# Calculate metrics
metrics_df = explanation_metrics.calculate_metrics()
print(metrics_df)
```
---
#### Explanation Metrics Overview
The **`ExplanationMetrics`** class supports the following key metrics for evaluating explanations:
| **Metric** | **Purpose** | **Description** |
|----------------------|----------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------|
| **Faithfulness** | Measures consistency between attributions and prediction changes. | Correlation between attribution values and changes in model output when features are perturbed. |
| **Infidelity** | Assesses how closely attributions align with the actual prediction impact. | Squared difference between predicted and actual impact when features are perturbed. |
| **Sensitivity** | Evaluates the robustness of attributions to small changes in inputs. | Compares attribution values before and after perturbing input features. |
| **Comprehensiveness**| Assesses the explanatory power of the top-k features. | Measures how much model prediction decreases when top-k important features are removed. |
| **Sufficiency** | Determines whether top-k features alone are sufficient to explain the model's output. | Compares predictions based only on the top-k features to baseline predictions. |
| **Monotonicity** | Verifies the consistency of attribution values with the direction of predictions. | Ensures that changes in attributions match consistent changes in predictions. |
| **Complexity** | Measures the sparsity of explanations. | Counts the number of features with non-zero attribution values. |
| **Sparseness** | Assesses how minimal the explanation is. | Calculates the proportion of features with zero attribution values. |
Reference Values for Available Metrics :
| Metric | Typical Range | Interpretation | "Better" Direction |
|------------------|--------------------------|---------------------------------------------------------------------------------------------------------------|-------------------------------------|
| Faithfulness | -1 to 1 | Measures correlation between attributions and changes in model output when removing features. Higher indicates that more important features (according to the explanation) indeed cause larger changes in the model’s prediction. | Higher is better (closer to 1) |
| Infidelity | ≥ 0 | Measures how well attributions predict changes in the model’s output under input perturbations. Lower infidelity means the attributions closely match the model’s behavior under perturbations. | Lower is better (closer to 0) |
| Sensitivity | ≥ 0 | Measures how stable attributions are to small changes in the input. Lower values mean more stable (robust) explanations. | Lower is better (closer to 0) |
| Comprehensiveness | Depends on model output | Measures how much the prediction drops when the top-k most important features are removed. If removing them significantly decreases the prediction, it suggests these features are truly important. | Higher difference indicates more comprehensive explanations |
| Sufficiency | Depends on model output | Measures how well the top-k features alone approximate or even match the original prediction. A higher (or less negative) value means these top-k features are sufficient on their own, capturing most of what the model uses. | Higher (or closer to zero if baseline is the original prediction) is generally better |
| Monotonicity | 0 to 1 (as an average) | Checks if attributions are in a non-increasing order. A higher average indicates that the explanation presents a consistent ranking of feature importance. | Higher is better (closer to 1) |
| Complexity | Depends on number of features | Measures the number of non-zero attributions. More features with non-zero attributions mean a more complex explanation. Fewer important features make it easier to interpret. | Lower is typically preferred |
| Sparseness | 0 to 1 | Measures the fraction of attributions that are zero. Higher sparseness means fewer features are highlighted, making the explanation simpler. | Higher is generally preferred |
---
#### Practical Examples
**1. Faithfulness Correlation**
- Correlates feature attributions with prediction changes when features are perturbed.
- Higher correlation indicates that the explanation aligns well with model predictions.
```python
faithfulness_score = explanation_metrics.calculate_metrics()['faithfulness']
print("Faithfulness:", faithfulness_score)
```
**2. Infidelity**
- Computes the squared difference between predicted and actual changes in model output.
- Lower scores indicate higher alignment of explanations with model behavior.
```python
infidelity_score = explanation_metrics.calculate_metrics()['infidelity']
print("Infidelity:", infidelity_score)
```
**3. Comprehensiveness**
- Evaluates whether removing the top-k features significantly reduces the model's prediction confidence.
- A higher score indicates that the top-k features are critical for the prediction.
```python
comprehensiveness_score = explanation_metrics.calculate_metrics()['comprehensiveness']
print("Comprehensiveness:", comprehensiveness_score)
```
---
#### Example Output
After calculating the metrics, the method returns a DataFrame summarizing the results:
| Metric | Value |
|-------------------|---------|
| Faithfulness | 0.89 |
| Infidelity | 0.05 |
| Sensitivity | 0.13 |
| Comprehensiveness | 0.62 |
| Sufficiency | 0.45 |
| Monotonicity | 1.00 |
| Complexity | 7 |
| Sparseness | 0.81 |
---
### Image Metrics Calculation
The **`xai_evals`** package provides a powerful class, **`ExplanationMetricsImage`**, to evaluate the quality of explanations generated for image-based deep learning models. This class allows you to calculate several metrics, helping you assess the robustness, reliability, and interpretability of your image explanations. [NOTE: Metrics currently support image-based deep learning models such as PyTorch and TensorFlow.]
#### ExplanationMetricsImage Class
The **`ExplanationMetricsImage`** class in **`xai_evals`** provides a structured way to evaluate the quality and reliability of image-based explanations, such as GradCAM, Integrated Gradients, and Occlusion. By assessing multiple metrics, you can better understand how well these image explanations align with your model's predictions and behavior. This class uses **Quantus** to calculate the various metrics for evaluating explanations.
---
#### Steps for Using ExplanationMetricsImage
1. **Initialize ExplanationMetricsImage**
Begin by creating an instance of the **`ExplanationMetricsImage`** class with the necessary inputs, including the model, dataset, and evaluation settings.
2. **Evaluate Explanation Metrics**
Use the `evaluate` method to compute various metrics for evaluating image-based explanations. The method returns a dictionary with the results.
```python
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from tensorflow.keras.datasets import cifar10
from xai_evals.metrics import ExplanationMetricsImage
from torchvision import models
import tensorflow as tf
import numpy as np
import torch.optim as optim
import tensorflow.keras as keras
# --- TensorFlow Setup ---
# Load CIFAR-10 dataset (for TensorFlow example)
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0 # Normalize the images
train_data = (x_train, y_train) # Tuple of data and labels
test_data = (x_test, y_test) # Tuple of data and labels
# Convert to TensorFlow Dataset
train_dataset_tf = tf.data.Dataset.from_tensor_slices(train_data).batch(32)
test_dataset_tf = tf.data.Dataset.from_tensor_slices(test_data).batch(32)
# --- PyTorch Setup ---
# PyTorch Dataset for CIFAR-10
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = DataLoader(trainset, batch_size=4, shuffle=True)
# --- Custom Model Setup ---
# Custom PyTorch model (simple CNN for CIFAR-10)
class SimpleCNN(torch.nn.Module):
def __init__(self):
super(SimpleCNN, self).__init__()
self.conv1 = torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)
self.conv2 = torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
self.fc1 = torch.nn.Linear(64*8*8, 128)
self.fc2 = torch.nn.Linear(128, 10) # 10 classes for CIFAR-10
def forward(self, x):
x = torch.relu(self.conv1(x))
x = torch.max_pool2d(x, 2)
x = torch.relu(self.conv2(x))
x = torch.max_pool2d(x, 2)
x = x.view(x.size(0), -1)
x = torch.relu(self.fc1(x))
x = self.fc2(x)
return x
# --- TensorFlow Model Setup ---
model_tf = tf.keras.Sequential([
tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu', input_shape=(32, 32, 3)),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu'),
tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10) # 10 classes for CIFAR-10
])
# Compile the model for training
model_tf.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# --- TensorFlow Model Training (1 Epoch) ---
model_tf.fit(train_dataset_tf, epochs=1)
print("Finished TensorFlow Training")
# Initialize PyTorch model
model_torch = SimpleCNN()
model_torch.train() # Set model to training mode
# --- Training PyTorch Model for 1 Epoch ---
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model_torch.parameters(), lr=0.001, momentum=0.9)
for epoch in range(1): # Training for 1 epoch
running_loss = 0.0
for i, data in enumerate(trainloader, 0):
inputs, labels = data
optimizer.zero_grad()
outputs = model_torch(inputs)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
running_loss += loss.item()
if i % 2000 == 1999: # Print every 2000 mini-batches
print(f"[{epoch + 1}, {i + 1}] loss: {running_loss / 2000:.3f}")
running_loss = 0.0
print("Finished PyTorch Training")
# --- Example 1: PyTorch Metrics Calculation ---
metrics_image_pytorch = ExplanationMetricsImage(
model=model_torch,
data_loader=trainloader,
framework="torch",
num_classes=10
)
# Example: Calculate metrics using the PyTorch DataLoader
metrics_results_pytorch = metrics_image_pytorch.evaluate(
start_idx=0, end_idx=32,
metric_names=["FaithfulnessCorrelation","MaxSensitivity","MPRT","SmoothMPRT","AvgSensitivity","FaithfulnessEstimate"],
xai_method_name="IntegratedGradients"
)
print("PyTorch Example Metrics:", metrics_results_pytorch)
# --- Example 2: TensorFlow Metrics Calculation ---
metrics_image_tensorflow = ExplanationMetricsImage(
model=model_tf, # Use TensorFlow model for TensorFlow example
data_loader=train_dataset_tf,
framework="tensorflow",
num_classes=10
)
# Example: Calculate metrics using the TensorFlow Dataset
metrics_results_tensorflow = metrics_image_tensorflow.evaluate(
start_idx=0, end_idx=32,
metric_names=["FaithfulnessCorrelation","MaxSensitivity","MPRT","SmoothMPRT","AvgSensitivity","FaithfulnessEstimate"],
xai_method_name="GradCAM"
)
print("TensorFlow Example Metrics:", metrics_results_tensorflow)
# --- Example 3: Explain using a single image (numpy array) ---
single_image_numpy = np.random.randn(1,3,32, 32) # Random image as a NumPy array, [H, W, C]
label = np.random.randint(0, 9,size=1)
# Initialize ExplanationMetricsImage for a single image (use PyTorch framework even for NumPy array)
metrics_image_single = ExplanationMetricsImage(
model=model_torch, # Use PyTorch model
data_loader=(single_image_numpy,label), # Pass the single image as a numpy array
framework="torch", # Use the torch framework for single image
num_classes=10,
)
# Calculate metrics for the single image
metrics_single_image = metrics_image_single.evaluate(
start_idx=0, end_idx=1,
metric_names=["FaithfulnessCorrelation","MaxSensitivity","MPRT","SmoothMPRT","AvgSensitivity","FaithfulnessEstimate"],
xai_method_name="IntegratedGradients"
)
print("Single Image Example Metrics:", metrics_single_image)
# --- Example 4: TensorFlow Model with Single Image ---
single_image_numpy = np.random.randn(1,32, 32,3) # Random image as a NumPy array, [H, W, C]
label = np.random.randint(0, 9,size=1)
# For TensorFlow, the single image example using TensorFlow framework
metrics_image_single_tf = ExplanationMetricsImage(
model=model_tf, # Use TensorFlow model
data_loader=(single_image_numpy,label), # Pass the single image as a numpy array
framework="tensorflow", # Use the tensorflow framework for single image
num_classes=10
)
# Calculate metrics for the single image
metrics_single_image_tf = metrics_image_single_tf.evaluate(
start_idx=0, end_idx=1,
metric_names=["FaithfulnessCorrelation","MaxSensitivity","MPRT","SmoothMPRT","AvgSensitivity","FaithfulnessEstimate"],
xai_method_name="GradCAM"
)
print("TensorFlow Single Image Example Metrics:", metrics_single_image_tf)
```
---
#### Explanation Metrics Overview
The **`ExplanationMetricsImage`** class supports the following key metrics for evaluating image explanations:
| **Metric** | **Purpose** | **Description** |
|--------------------------|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
| **FaithfulnessCorrelation** | Measures the correlation between attribution values and model output changes when perturbing image features. | Higher values indicate that important features (according to the explanation) indeed cause significant changes in the model’s prediction. |
| **MaxSensitivity** | Measures the maximum sensitivity of an attribution method to input perturbations. | Higher values suggest that the attribution method highlights the most sensitive parts of the image. |
| **MPRT** | Measures the relevance of features based on perturbations. | Helps evaluate the robustness of the explanation when features are perturbed. |
| **SmoothMPRT** | A smoother version of MPRT that reduces noise from perturbations. | Ensures more stable results by averaging perturbations. |
| **AvgSensitivity** | Measures the average sensitivity of the model to input perturbations across all features. | Indicates how sensitive the model is to small changes in the input. |
| **FaithfulnessEstimate** | Estimates the faithfulness of the attribution by comparing against a perturbation baseline. | Compares how well the explanation reflects the model’s behavior under feature perturbations. |
Reference Values for Available Metrics:
| Metric | Typical Range | Interpretation | "Better" Direction |
|--------------------------|-------------------------|---------------------------------------------------------------------------------------------------------|--------------------------------------|
| FaithfulnessCorrelation | -1 to 1 | Measures correlation between attribution values and changes in model output when features are perturbed. Higher indicates that more important features (according to the explanation) indeed cause larger changes in the model’s prediction. | Higher is better (closer to 1) |
| MaxSensitivity | ≥ 0 | Measures how well attributions match model sensitivity when perturbing image features. Lower scores indicate that the explanations focus on the most sensitive features. | Lower is better (closer to 0) |
| MPRT | ≥ 0 | Measures how the perturbation of features affects the model’s prediction. A higher score indicates that the model's prediction is heavily influenced by the perturbed features. | Higher is better |
| SmoothMPRT | ≥ 0 | Measures the stability of MPRT under perturbation noise. Higher values suggest more stable explanations. | Higher is better |
| AvgSensitivity | ≥ 0 | Measures the average change in prediction for small changes in input features. Indicates model robustness. | Lower is better |
| FaithfulnessEstimate | 0 to 1 | Compares model predictions under perturbations and attributions. Higher values indicate better alignment. | Higher is better |
---
#### Initialization Attributes / Contructor for **`ExplanationMetricsImage`** class
| **Attribute** | **Description** | **Values** |
|----------------------|----------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|
| `model` | The trained model for which explanations will be evaluated. | [PyTorch model, TensorFlow model] |
| `data_loader` | The data loader or dataset containing the test data. | [PyTorch Dataset,PyTorch DataLoader,TensorFlow Dataset, tuple of (image-np.array/torch.Tensor/tensorflow.Tensor.Tensor,label-np.array/torch.Tensor/tensorflow.Tensor)] |
| `framework` | The framework used for the model (either 'torch' or 'tensorflow' or 'backtrace'). | {'torch', 'tensorflow','backtrace'} |
| `device` | The device (CPU/GPU) used for performing computations (for PyTorch models). | [torch.device (Optional)] |
| `num_classes` | The number of classes for classification tasks. | Integer (default: 10) |
---
#### Evaluate Function (`evaluate`) for **`ExplanationMetricsImage`** class to calculate metrics
| **Attribute** | **Description** | **Values** |
|-----------------------|----------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|
| `start_idx` | The starting index of the batch for evaluation. | Integer (e.g., 0) |
| `end_idx` | The ending index of the batch for evaluation. | Integer (e.g., 100, or `None` for the entire batch) |
| `metric_names` | The list of metric names to evaluate. | List of strings representing the metrics to compute (e.g., `["FaithfulnessCorrelation", "MaxSensitivity", "MPRT", "SmoothMPRT", "AvgSensitivity", "FaithfulnessEstimate"]`) |
| `xai_method_name` | The name of the XAI method used for explanations (e.g., 'IntegratedGradients', 'GradCAM', etc.). | String (e.g., for Torch `{grad_cam, integrated_gradients, saliency, deep_lift, gradient_shap, guided_backprop, occlusion, layer_gradcam, feature_ablation}` ; for Tensorflow `{VanillaGradients, GradCAM,GradientsInput,IntegratedGradients,OcclusionSensitivity,SmoothGrad` & for Backtrace `{default,contrast-positive,contrast-negative}`) |
---
#### Practical Examples
**1. Faithfulness Correlation**
- Correlates feature attributions with prediction changes when features (pixels) in the image are perturbed.
- Higher correlation indicates that the explanation aligns well with model predictions.
```python
faithfulness_score = metrics_image.evaluate(
start_idx=0, end_idx=5, metric_names=["FaithfulnessCorrelation"], xai_method_name="IntegratedGradients"
)['FaithfulnessCorrelation']
print("Faithfulness:", faithfulness_score)
```
**2. Max Sensitivity**
- Measures the sensitivity of the explanation method by observing the effect of perturbing different parts of the image.
- A higher score indicates that the explanation method is sensitive to the most influential pixels.
```python
max_sensitivity_score = metrics_image.evaluate(
start_idx=0, end_idx=5, metric_names=["MaxSensitivity"], xai_method_name="IntegratedGradients"
)['MaxSensitivity']
print("Max Sensitivity:", max_sensitivity_score)
```
---
#### Example Output
After calculating the metrics, the method returns a dictionary summarizing the results:
| Metric | Value |
|--------------------------|---------|
| FaithfulnessCorrelation | 0.88 |
| MaxSensitivity | 0.92 |
---
#### Benefits of ExplanationMetrics
- **Interpretability:** Quantifies how well feature attributions explain the model's predictions.
- **Robustness:** Evaluates the stability of explanations under input perturbations.
- **Comprehensiveness and Sufficiency:** Provides insights into the contribution of top features to the model’s predictions.
- **Scalability:** Works with various tasks, including binary classification, multi-class classification, and regression.
By leveraging these metrics, you can ensure that your explanations are meaningful, robust, and align closely with your model's decision-making process.
---
### Acknowledgements
We would like to extend our heartfelt thanks to the developers and contributors of the libraries **[Quantus](https://github.com/Trusted-AI/quantus)**, **[Captum](https://captum.ai/)**, **[tf-explain](https://github.com/sicara/tf-explain)**, **[LIME](https://github.com/marcotcr/lime)**, and **[SHAP](https://github.com/slundberg/shap)**, which have been instrumental in enabling the explainability methods implemented in this package.
- **[Quantus](https://github.com/Trusted-AI/quantus)** provides a comprehensive suite of metrics that allow us to evaluate and assess the quality of explanations, ensuring that our interpretability methods are both reliable and robust.
- **[Captum](https://captum.ai/)** is an invaluable tool for PyTorch users, offering a variety of powerful attribution methods like Integrated Gradients, Saliency, and Gradient Shap, which are crucial for generating insights into the inner workings of deep learning models.
- **[tf-explain](https://github.com/sicara/tf-explain)** simplifies the process of explaining TensorFlow/Keras models, with methods like GradCAM and Occlusion Sensitivity, enabling us to generate visual explanations that help interpret the decision-making of complex models.
- **[LIME](https://github.com/marcotcr/lime)** (Local Interpretable Model-Agnostic Explanations) has been a key library for providing local explanations for machine learning models, allowing us to generate understandable explanations for individual predictions.
- **[SHAP](https://github.com/slundberg/shap)** (SHapley Additive exPlanations) is essential for computing Shapley values and provides a unified approach to explaining machine learning models, making it easier to understand feature contributions across a range of model types.
We are deeply grateful for the contributions these libraries have made in advancing model interpretability, and their seamless integration in our package ensures that users can leverage state-of-the-art methods for understanding machine learning and deep learning models.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
---
### Future Plans
In the future, we will continue to improve this library.
---
## Citations
This code is free. So, if you use this code anywhere, please cite us:
```
@misc{seth2025xaievalsframeworkevaluating,
title={xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods},
author={Pratinav Seth and Yashwardhan Rathore and Neeraj Kumar Singh and Chintan Chitroda and Vinay Kumar Sankarapu},
year={2025},
eprint={2502.03014},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2502.03014},
}
```
## Get in touch
Contanct us at [AryaXAI](https://www.aryaxai.com/).
Raw data
{
"_id": null,
"home_page": "https://github.com/AryaXAI/xai_evals",
"name": "xai-evals",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.0",
"maintainer_email": null,
"keywords": "aryaxai deep learning backtrace, ML observability",
"author": null,
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/e5/cc/8379d2c135be5f7d092a09c9bee49429f65ccc4576ba4da779041c55d978/xai_evals-0.0.10.tar.gz",
"platform": null,
"description": "# xai_evals\n## by - [AryaXAI](https://www.aryaxai.com/)\n\n**`xai_evals`** is a Python package designed to generate and benchmark various explainability methods for machine learning and deep learning models. It offers tools for creating and evaluating explanations of popular machine learning models, supporting widely-used explanation methods. The package aims to streamline the interpretability of machine learning models, allowing practitioners to gain insights into how their models make predictions. Additionally, it includes several metrics for assessing the quality of these explanations . [](LICENSE) \n\nTechnical Report : [xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods](https://arxiv.org/abs/2502.03014)\n\n\n---\n\n## Table of Contents\n\n- [Installation](#installation)\n- [Usage](#usage)\n - [SHAP Tabular Explainer](#shap-tabular-explainer)\n - [LIME Tabular Explainer](#lime-tabular-explainer)\n - [Torch Tabular Explainer](#torch-tabular-explainer)\n - [TFKeras Tabular Explainer](#tfkeras-tabular-explainer)\n - [DlBacktrace Tabular Explainer](#dlbacktrace-tabular-explainer)\n - [Tabular Metrics Calculation](#tabular-metrics-calculation)\n - [Torch Image Explainer](#torch-image-explainer)\n - [TFKeras Image Explainer](#tfkeras-image-explainer)\n - [DlBacktrace Image Explainer](#dlbacktrace-image-explainer)\n - [Tabular Metrics Calculation](#tabular-metrics-calculation)\n - [Image Metrics Calculation](#image-metrics-calculation)\n- [License](#license)\n\n---\n\n## Installation\n\nTo install **`xai_evals`**, you can use `pip`. First, clone the repository or download the files to your local environment. Then, install the necessary dependencies:\n\n```bash\npip install xai_evals\n```\n\n## Example Notebooks : \n\n### Tensorflow-Keras : \n\n| Name | Dataset | Link |\n|-------------|-------------|-------------------------------|\n| Tabualar ML Models Illustration and Evaluation Metrics | IRIS Dataset | [Colab Link](https://colab.research.google.com/drive/1UoT5Gx5d_L1KQmiirGUyyE1b9ajayO3L?usp=sharing) |\n| Tabular Deep Learning Model Illustration and Evaluation Metrics | Lending Club | [Colab Link](https://colab.research.google.com/drive/17vuRt4D7ph6ZnAbrWMJ2aRum2mk14Tc6?usp=sharing) |\n| Image Deep Learning Model Illustration and Evaluation Metrics | CIFAR10 | [Colab Link](https://colab.research.google.com/drive/1DNUMT6CNx2VGHsK8qhl3dEEtoN3eA7ar?usp=sharing) |\n\n## Usage\n\n## Usage : Machine Learning Models\n\nSupported Machine Learning Models for `SHAPExplainer` and `LIMEExplainer` class is as follows : \n\n| **Library** | **Supported Models** |\n|-------------------------|------------------------------------------------------------------------------------------------------|\n| **scikit-learn** | LogisticRegression, RandomForestClassifier, SVC, SGDClassifier, GradientBoostingClassifier, AdaBoostClassifier, DecisionTreeClassifier, KNeighborsClassifier, GaussianNB, LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis, KMeans, NearestCentroid, BaggingClassifier, VotingClassifier, MLPClassifier, LogisticRegressionCV, RidgeClassifier, ElasticNet |\n| **xgboost** | XGBClassifier |\n| **catboost** | CatBoostClassifier |\n| **lightgbm** | LGBMClassifier |\n| **sklearn.ensemble** | HistGradientBoostingClassifier, ExtraTreesClassifier |\n\n### SHAP Tabular Explainer\n\nThe `SHAPExplainer` class allows you to compute and visualize **SHAP** values for your trained model. It supports various types of models, including tree-based models (e.g., `RandomForest`, `XGBoost`) and deep learning models (e.g., PyTorch models).\n\n**Example:**\n\n```python\nfrom xai_evals.explainer import SHAPExplainer\nfrom sklearn.ensemble import RandomForestClassifier\nimport pandas as pd\nfrom sklearn.datasets import load_iris\n\n# Load dataset and train a model\ndata = load_iris()\nX = pd.DataFrame(data.data, columns=data.feature_names)\ny = data.target\nmodel = RandomForestClassifier()\nmodel.fit(X, y)\n\n# Initialize SHAP explainer\nshap_explainer = SHAPExplainer(model=model, features=X.columns, task=\"multiclass-classification\", X_train=X)\n\n# Explain a specific instance (e.g., the first instance in the test set)\nshap_attributions = shap_explainer.explain(X, instance_idx=0)\n\n# Print the feature attributions\nprint(shap_attributions)\n```\n\n| **Feature** | **Value** | **Attribution** |\n|-----------------------|-----------|-----------------|\n| petal_length_(cm) | 1.4 | 0.360667 |\n| petal_width_(cm) | 0.2 | 0.294867 |\n| sepal_length_(cm) | 5.1 | 0.023467 |\n| sepal_width_(cm) | 3.5 | 0.010500 |\n\n\n### LIME Tabular Explainer\n\nThe `LIMEExplainer` class allows you to generate **LIME** explanations, which work by perturbing the input data and fitting a locally interpretable model.\n\n**Example:**\n\n```python\nfrom xai_evals.explainer import LIMEExplainer\nfrom sklearn.linear_model import LogisticRegression\nimport pandas as pd\nfrom sklearn.datasets import load_iris\n\n# Load dataset and train a model\ndata = load_iris()\nX = pd.DataFrame(data.data, columns=data.feature_names)\ny = data.target\nmodel = LogisticRegression(max_iter=200)\nmodel.fit(X, y)\n\n# Initialize LIME explainer\nlime_explainer = LIMEExplainer(model=model, features=X.columns, task=\"multiclass-classification\", X_train=X)\n\n# Explain a specific instance (e.g., the first instance in the test set)\nlime_attributions = lime_explainer.explain(X, instance_idx=0)\n\n# Print the feature attributions\nprint(lime_attributions)\n```\n| **Feature** | **Value** | **Attribution** |\n|-----------------------|-----------|-----------------|\n| petal_length_(cm) | 1.4 | 0.497993 |\n| petal_width_(cm) | 0.2 | 0.213963 |\n| sepal_length_(cm) | 5.1 | 0.127047 |\n| sepal_width_(cm) | 3.5 | 0.053926 |\n\nFor **LIMEExplainer and SHAPExplainer Class** we have several attributes :\n\n| Attribute | Description | Values |\n|--------------|-------------|--------|\n| model | Trained model which you want to explain | [sklearn model] |\n| features | Features present in the Training/Testing Set | [list of features] |\n| X_train | Training Set Data | {pd.dataframe,numpy.array} |\n| task | Task performed by the model | {binary,multiclass} |\n| model_classes (Only for LIME) | List of Classes to be predicted by model | [list of classes] |\n| subset_samples (Only for SHAP) | If we want to use k-means based sampling to use a subset for SHAP Explainer | True/False |\n| subset_number (Only for SHAP)| Number of samples to sample if subset_samples is True | int |\n\n## Usage : Deep Learning Models\n\n### Torch Tabular Explainer\n\nThe `TorchTabularExplainer` class allows you to generate explanations for Pytorch Deep Learning Model . Explaination Method available include 'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep' and 'lime'.\n\n| Attribute | Description | Values |\n|--------------|-------------|--------|\n| model | Trained Torch model which you want to explain | [Torch Model] |\n| method | Explanation method. Options:'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep', 'lime' | string |\n| X_train | Training Set Data | {pd.dataframe,numpy.array} |\n| feature_names | Features present in the Training/Testing Set | [list of features] |\n| task | Task performed by the model | {binary-classification,multiclass-classification} |\n\n### TFKeras Tabular Explainer\n\nThe `TFTabularExplainer` class allows you to generate explanations for Tensorflow/Keras Deep Learning Model . Explaination Method available include 'shap_kernel', 'shap_deep' and 'lime'.\n\n| Attribute | Description | Values |\n|--------------|-------------|--------|\n| model | Trained Tf/Keras model which you want to explain | [Tf/Keras Model] |\n| method | Explanation method. Options:'shap_kernel', 'shap_deep', 'lime' | string |\n| X_train | Training Set Data | {pd.dataframe,numpy.array} |\n| feature_names | Features present in the Training/Testing Set | [list of features] |\n| task | Task performed by the model | {binary-classification,multiclass-classification} |\n\n\n\n### DlBacktrace Tabular Explainer\n\nThe `DlBacktraceTabularExplainer` , based on DLBacktrace, a method for analyzing neural networks by tracing the relevance of each component from output to input, to understand how each part contributes to the final prediction. It offers two modes: Default and Contrast, and is compatible with TensorFlow and PyTorch. (https://github.com/AryaXAI/DLBacktrace)\n \n| Attribute | Description | Values |\n|--------------|-------------|--------|\n| model | Trained Tf/Keras/Torch model which you want to explain | [Torch/Tf/Keras Model] |\n| method | Explanation method. Options:\"default\" or \"contrastive\" | string |\n| X_train | Training Set Data | {pd.dataframe,numpy.array} |\n| scaler | Total / Starting Relevance at the Last Layer | Integer (Default: 1) |\n| feature_names | Features present in the Training/Testing Set | [list of features] |\n| thresholding | Thresholding for Model Prediction | float (Default : 0.5) |\n| task | Task performed by the model | {binary-classification,multiclass-classification} |\n\n### Torch Image Explainer\n\nThe `TorchImageExplainer` class allows you to generate explanations for PyTorch-based CNN models. This class wraps around several attribution methods available in Captum, including:\n\n- **Integrated Gradients**\n- **Saliency**\n- **DeepLift**\n- **GradientShap**\n- **GuidedBackprop**\n- **Occlusion**\n- **LayerGradCam**\n\n**Example:**\n\n```python\nimport torch\nimport torchvision\nimport torchvision.transforms as transforms\nfrom torch.utils.data import DataLoader\nfrom xai_evals.explainer import TorchImageExplainer\nfrom torchvision import models\nimport numpy as np\nimport matplotlib.pyplot as plt\n\n# Load CIFAR-10 dataset\ntransform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\ntrainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)\ntrainloader = DataLoader(trainset, batch_size=4, shuffle=True)\n\n# Load pre-trained ResNet model\nmodel = models.resnet18(pretrained=True)\nmodel.eval()\n\n# Initialize the TorchImageExplainer\nexplainer = TorchImageExplainer(model)\n\n# Example 1: Explain using DataLoader (batch of images)\nidx = 0 # Index for the image in the DataLoader\nmethod = \"integrated_gradients\"\ntask = \"classification\"\nattribution_map = explainer.explain(trainloader, idx, method, task)\n\n# Visualize attribution map (simplified)\nplt.imshow(attribution_map)\nplt.title(f\"Attribution Map - {method} for Dataloader Torch\")\nplt.show()\n\n# Example 2: Explain using a single image (torch.Tensor)\nsingle_image_tensor = torch.randn(3, 32, 32) # Random image as a tensor, [C, H, W]\nattribution_map = explainer.explain(single_image_tensor, idx=None, method=method, task=task)\n\n# Visualize attribution map for the single image\nplt.imshow(attribution_map)\nplt.title(f\"Attribution Map - {method} for Single Image (Tensor)\")\nplt.show()\n\n# Example 3: Explain using a single image (np.ndarray)\nsingle_image_numpy = np.random.randn(3, 32, 32) # Random image as a NumPy array, [C, H, W]\nattribution_map = explainer.explain(single_image_numpy, idx=None, method=method, task=task)\n\n# Visualize attribution map for the single image (NumPy)\nplt.imshow(attribution_map)\nplt.title(f\"Attribution Map - {method} for Single Image (NumPy)\")\nplt.show()\n```\n\n#### **TorchImageExplainer**: `explain` Function Attributes\n\n| **Attribute** | **Description** | **Values** |\n|---------------|-----------------|-----------|\n| `testdata` | The input data, which can be a DataLoader, NumPy array, or Tensor. | `[torch.utils.data.DataLoader, np.ndarray, torch.Tensor]` |\n| `idx` | The index of the test sample to explain. | `int` or `None` (for explaining a single sample or all samples) |\n| `method` | The explanation method to use. | `{grad_cam, integrated_gradients, saliency, deep_lift, gradient_shap, guided_backprop, occlusion, layer_gradcam, feature_ablation}` |\n| `task` | The type of model task (e.g., classification). | `{classification}` |\n\n---\n\n### TFKeras Image Explainer\n\nThe `TFImageExplainer` class provides a similar functionality for TensorFlow/Keras-based models, allowing you to generate explanations for images using methods like GradCAM and Occlusion Sensitivity.\n\n**Example:**\n\n```python\nimport tensorflow as tf\nfrom tensorflow.keras import layers, models\nfrom tensorflow.keras.datasets import cifar10\nimport numpy as np\nimport matplotlib.pyplot as plt\nfrom xai_evals.explainer import TFImageExplainer\n\n# Step 1: Define a Custom CNN Model\ndef create_custom_model():\n model = models.Sequential([\n layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),\n layers.MaxPooling2D((2, 2)),\n layers.Conv2D(64, (3, 3), activation='relu'),\n layers.MaxPooling2D((2, 2)),\n layers.Conv2D(64, (3, 3), activation='relu'),\n layers.Flatten(),\n layers.Dense(64, activation='relu'),\n layers.Dense(10, activation='softmax')\n ])\n model.compile(optimizer='adam',\n loss='sparse_categorical_crossentropy',\n metrics=['accuracy'])\n return model\n\n# Load CIFAR-10 dataset\n(x_train, y_train), (x_test, y_test) = cifar10.load_data()\n\n# Normalize pixel values to be between 0 and 1\nx_train = x_train.astype(\"float32\") / 255.0\nx_test = x_test.astype(\"float32\") / 255.0\n\n# Create a TensorFlow Dataset from the test data\ntest_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test)).batch(32)\n\n# Initialize and train the custom model\nmodel = create_custom_model()\n\n# Train the model\nmodel.fit(x_train, y_train, epochs=1, batch_size=64)\n\n# Step 2: Use the TFImageExplainer with the Custom Model\nexplainer = TFImageExplainer(model)\n\n# Example 1: Explain a Single Image (NumPy Array)\nimage = x_test[0] # Select the first image\nlabel = y_test[0] # Get the label for the first image\n\n# Generate the Grad-CAM explanation for the image\nattribution_map = explainer.explain(image, idx=None, method=\"grad_cam\", task=\"classification\")\n\n# Visualize the attribution map\nplt.imshow(attribution_map, cmap=\"jet\")\nplt.colorbar()\nplt.title(\"Grad-CAM Attribution Map for CIFAR-10 Image\")\nplt.show()\n\n# Example 2: Explain an Image from the TensorFlow Dataset (Using idx)\nidx = 10 # Select the 10th image from the test dataset\n\n# Generate the Grad-CAM explanation for the image at index `idx`\nattribution_map = explainer.explain(test_dataset, idx, method=\"grad_cam\", task=\"classification\")\n\n# Visualize the attribution map\nplt.imshow(attribution_map, cmap=\"jet\")\nplt.colorbar()\nplt.title(f\"Grad-CAM Attribution Map for Image Index {idx} in CIFAR-10\")\nplt.show()\n```\n\n#### **TFImageExplainer**: `explain` Function Attributes\n\n| **Attribute** | **Description** | **Values** |\n|---------------|-----------------|-----------|\n| `testset` | The input data, which can be a NumPy array, TensorFlow tensor, or Dataset. | `[np.ndarray, tf.Tensor, tf.data.Dataset]` |\n| `idx` | The index of the test sample to explain. | `int` or `None` (for explaining a single sample or all samples) |\n| `method` | The explanation method to use. | `{grad_cam, occlusion}` |\n| `task` | The type of model task (e.g., classification). | `{classification}` |\n| `label` | The class label for the input sample (used for classification tasks). | `int` |\n\n---\n\n### DlBacktrace Image Explainer\n\nThe `DlBacktraceImageExplainer` based on DLBacktrace, a method for analyzing neural networks by tracing the relevance of each component from output to input, to understand how each part contributes to the final prediction. It offers two modes: Default and Contrast, and is compatible with TensorFlow and PyTorch. (https://github.com/AryaXAI/DLBacktrace)\n\n**Example: Tensorflow Model DlBacktraceImageExplainer**\n\n```python\n# Load CIFAR-10 data\n(x_train, y_train), (x_test, y_test) = cifar10.load_data()\n\n# Normalize pixel values to be between 0 and 1\nx_train, x_test = x_train / 255.0, x_test / 255.0\n\n# Create a simple CNN model for CIFAR-10\ndef create_cnn_model():\n model = models.Sequential([\n layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),\n layers.MaxPooling2D((2, 2)),\n layers.Conv2D(64, (3, 3), activation='relu'),\n layers.MaxPooling2D((2, 2)),\n layers.Conv2D(64, (3, 3), activation='relu'),\n layers.Flatten(),\n layers.Dense(64, activation='relu'),\n layers.Dense(10)\n ])\n model.compile(optimizer='adam',\n loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n metrics=['accuracy'])\n return model\n\n# Create the model\nmodel = create_cnn_model()\n\n# Train the model\nmodel.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))\n\n# Save the model for later use\nmodel.save('cifar10_cnn_model.h5')\n\nexplainer = BacktraceImageExplainer(model=model)\n\n# Choose an image from the test set\ntest_image = x_test[0:1] # Selecting the first image for testing\n\n# Get the explanation for the test image\nexplanation = explainer.explain(test_image, instance_idx=0,mode='default', scaler=1, thresholding=0, task='multi-class-classification')\n\n# Plot the explanation (relevance map)\nplt.imshow(explanation, cmap='hot')\nplt.colorbar()\nplt.title(\"Feature Relevance for CIFAR-10 Image\")\nplt.show()\n```\n\n**Example: Torch Model DlBacktraceImageExplainer**\n\n```python\n# Define a simple CNN model for CIFAR-10 without using `view()`\nclass SimpleCNN(nn.Module):\n def __init__(self, num_classes=10):\n super(SimpleCNN, self).__init__()\n self.identity = nn.Identity()\n self.conv1 = nn.Conv2d(3, 16, 5,2)\n self.relu1 = nn.ReLU()\n self.conv2 = nn.Conv2d(16, 32, 3,2)\n self.relu2 = nn.ReLU()\n self.flatten = nn.Flatten()\n self.fc1 = nn.Linear(32 * 6 * 6, 512)\n self.relu3 = nn.ReLU()\n self.fc2 = nn.Linear(512, num_classes)\n\n def forward(self, x):\n x = self.identity(x)\n x = self.conv1(x)\n x = self.relu1(x)\n x = self.conv2(x)\n x = self.relu2(x)\n x = self.flatten(x)\n x = self.fc1(x)\n x = self.relu3(x)\n x = self.fc2(x)\n return x\n\n# Load CIFAR-10 data with transforms for normalization\ntransform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n\ntrainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)\ntestset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)\n\ntrainloader = DataLoader(trainset, batch_size=4, shuffle=True)\ntestloader = DataLoader(testset, batch_size=4, shuffle=False)\n\n# Initialize and train the model\nmodel = SimpleCNN()\nmodel.train()\n\n# Define loss and optimizer\ncriterion = nn.CrossEntropyLoss()\noptimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)\n\n# Training loop\nfor epoch in range(1): # Just a couple of epochs for testing\n running_loss = 0.0\n for inputs, labels in trainloader:\n optimizer.zero_grad()\n outputs = model(inputs)\n loss = criterion(outputs, labels)\n loss.backward()\n optimizer.step()\n running_loss += loss.item()\n\nprint(\"Finished Training\")\n\n# Test the model using the BacktraceImageExplainer\nexplainer = DlBacktraceImageExplainer(model=model)\n\n\n# Get the explanation for the first image\nexplanation = explainer.explain(testloader, instance_idx=0, mode='default', scaler=1, thresholding=0, task='multi-class-classification')\n\n# Plot the explanation (relevance map)\nplt.imshow(explanation, cmap='hot')\nplt.colorbar()\nplt.title(\"Feature Relevance for CIFAR-10 Image\")\nplt.show()\n```\n\n#### **DlBacktraceImageExplainer**: `explain` Function Attributes\n\n| **Attribute** | **Description** | **Values** |\n|---------------|-----------------|-----------|\n| `test_data` | The input data, which can be a NumPy array, TensorFlow tensor, or Dataset. | `[np.ndarray, tf.Tensor, tf.data.Dataset]` |\n| `instance_idx` | The index of the test sample to explain. | `int` (explaining a single sample) |\n| `mode` | The explanation mode to use. | `{default, contrast}` |\n| `task` | The type of model task (e.g., classification). | `{binary-classification,multiclass-classification}` |\n| `scaler` | Total / Starting Relevance at the Last Layer\t | `float` ( Default: None, Preferred: 1) |\n| `thresholding` | Thresholding Model Prediction to predict the actual class. | `float` |\n| `contrast_mode` | Mode to Use if using 'contrast' mode of DlBacktrace Algorithm | `{Positive,Negative}` |\n\n---\n\n### Tabular Metrics Calculation\n\nThe **`xai_evals`** package provides a powerful class, **`ExplanationMetricsTabular`**, to evaluate the quality of explanations generated by SHAP and LIME. This class allows you to calculate several metrics, helping you assess the robustness, reliability, and interpretability of your model explanations. [NOTE: Metrics only supports Sklearn ML Models]\n\n#### ExplanationMetrics Class\n\n\nThe **`ExplanationMetricsTabular`** class in `xai_evals` provides a structured way to evaluate the quality and reliability of explanations generated by SHAP or LIME for machine learning models. By assessing multiple metrics, you can better understand how well these explanations align with your model's predictions and behavior.\n\n---\n\n#### Steps for Using ExplanationMetrics\n\n1. **Initialize ExplanationMetrics** \n Begin by creating an instance of the `ExplanationMetricsTabular` class with the necessary inputs, including the model, explainer type, dataset, and the task type.\n\n ```python\n from xai_evals.metrics import ExplanationMetricsTabular\n from xai_evals.explainer import SHAPExplainer\n from sklearn.ensemble import RandomForestClassifier\n import pandas as pd\n from sklearn.datasets import load_iris\n\n # Load dataset and train a model\n data = load_iris()\n X = pd.DataFrame(data.data, columns=data.feature_names)\n y = data.target\n model = RandomForestClassifier()\n model.fit(X, y)\n\n # Initialize ExplanationMetrics with SHAP explainer\n explanation_metrics = ExplanationMetricsTabular(\n model=model,\n explainer_name=\"shap\",\n X_train=X,\n X_test=X,\n y_test=y,\n features=X.columns,\n task=\"binary\"\n )\n ```\n\nFor **ExplanationMetricsTabular Class** we have several attributes :\n\n\n| Attribute | Description | Values |\n|--------------|-------------|--------|\n| model | Trained model which you want to explain | {binary-classification, multiclass-classification}|\n| X_train | Training Set Data | {pd.dataframe,numpy.array} |\n| explainer_name | Which explaination method to use | {'shap','lime','torch','tensorflow', 'backtrace'} |\n| X_test | Test Set Data | {pd.dataframe,numpy.array} |\n| y_test | Test Set Labels | pd.dataseries |\n| features | Features present in the Training/Testing Set | [list of features] |\n| task | Task performed by the model | {binary-classification,multiclass-classification} |\n| metrics | List of metrics to calculate | ['faithfulness', 'infidelity', 'sensitivity', 'comprehensiveness', 'sufficiency', 'monotonicity', 'complexity', 'sparseness'] |\n| method | For specifying which explaination Method to use in Torch/Tensorflow/Backtrace Explainer | Torch-{ 'integrated_gradients', 'deep_lift', 'gradient_shap','saliency', 'input_x_gradient', 'guided_backprop','shap_kernel', 'shap_deep','lime'}, Tensorflow-{'shap_kernel','shap_deep','lime'},Backtrace-{'Default','Contrastive'} |\n| start_idx | Starting index of the dataset to evaluate | int |\n| end_idx | Ending index of the dataset to evaluate | int |\n| scaler | Total / Starting Relevance at the Last Layer\tInteger ( For Backtrace) | int (Default: None, Preferred: 1) |\n|thresholding | Thresholding Model Prediction | float (default=0.5) |\n|subset_samples | If we want to use k-means based sampling to use a subset for SHAP Explainer (Only for SHAP) |\tTrue/False |\n|subset_number | Number of samples to sample if subset_samples is True (Only for SHAP) |\tint |\n\n\n2. **Calculate Explanation Metrics** \n Use the `calculate_metrics` method to compute various metrics for evaluating explanations. The method returns a DataFrame with the results.\n\n ```python\n # Calculate metrics\n metrics_df = explanation_metrics.calculate_metrics()\n print(metrics_df)\n ```\n\n---\n\n#### Explanation Metrics Overview\n\nThe **`ExplanationMetrics`** class supports the following key metrics for evaluating explanations:\n\n| **Metric** | **Purpose** | **Description** |\n|----------------------|----------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------|\n| **Faithfulness** | Measures consistency between attributions and prediction changes. | Correlation between attribution values and changes in model output when features are perturbed. |\n| **Infidelity** | Assesses how closely attributions align with the actual prediction impact. | Squared difference between predicted and actual impact when features are perturbed. |\n| **Sensitivity** | Evaluates the robustness of attributions to small changes in inputs. | Compares attribution values before and after perturbing input features. |\n| **Comprehensiveness**| Assesses the explanatory power of the top-k features. | Measures how much model prediction decreases when top-k important features are removed. |\n| **Sufficiency** | Determines whether top-k features alone are sufficient to explain the model's output. | Compares predictions based only on the top-k features to baseline predictions. |\n| **Monotonicity** | Verifies the consistency of attribution values with the direction of predictions. | Ensures that changes in attributions match consistent changes in predictions. |\n| **Complexity** | Measures the sparsity of explanations. | Counts the number of features with non-zero attribution values. |\n| **Sparseness** | Assesses how minimal the explanation is. | Calculates the proportion of features with zero attribution values. |\n\nReference Values for Available Metrics : \n\n| Metric | Typical Range | Interpretation | \"Better\" Direction |\n|------------------|--------------------------|---------------------------------------------------------------------------------------------------------------|-------------------------------------|\n| Faithfulness | -1 to 1 | Measures correlation between attributions and changes in model output when removing features. Higher indicates that more important features (according to the explanation) indeed cause larger changes in the model\u2019s prediction. | Higher is better (closer to 1) |\n| Infidelity | \u2265 0 | Measures how well attributions predict changes in the model\u2019s output under input perturbations. Lower infidelity means the attributions closely match the model\u2019s behavior under perturbations. | Lower is better (closer to 0) |\n| Sensitivity | \u2265 0 | Measures how stable attributions are to small changes in the input. Lower values mean more stable (robust) explanations. | Lower is better (closer to 0) |\n| Comprehensiveness | Depends on model output | Measures how much the prediction drops when the top-k most important features are removed. If removing them significantly decreases the prediction, it suggests these features are truly important. | Higher difference indicates more comprehensive explanations |\n| Sufficiency | Depends on model output | Measures how well the top-k features alone approximate or even match the original prediction. A higher (or less negative) value means these top-k features are sufficient on their own, capturing most of what the model uses. | Higher (or closer to zero if baseline is the original prediction) is generally better |\n| Monotonicity | 0 to 1 (as an average) | Checks if attributions are in a non-increasing order. A higher average indicates that the explanation presents a consistent ranking of feature importance. | Higher is better (closer to 1) |\n| Complexity | Depends on number of features | Measures the number of non-zero attributions. More features with non-zero attributions mean a more complex explanation. Fewer important features make it easier to interpret. | Lower is typically preferred |\n| Sparseness | 0 to 1 | Measures the fraction of attributions that are zero. Higher sparseness means fewer features are highlighted, making the explanation simpler. | Higher is generally preferred |\n\n---\n\n#### Practical Examples\n\n**1. Faithfulness Correlation**\n - Correlates feature attributions with prediction changes when features are perturbed. \n - Higher correlation indicates that the explanation aligns well with model predictions.\n\n ```python\n faithfulness_score = explanation_metrics.calculate_metrics()['faithfulness']\n print(\"Faithfulness:\", faithfulness_score)\n ```\n\n**2. Infidelity**\n - Computes the squared difference between predicted and actual changes in model output.\n - Lower scores indicate higher alignment of explanations with model behavior.\n\n ```python\n infidelity_score = explanation_metrics.calculate_metrics()['infidelity']\n print(\"Infidelity:\", infidelity_score)\n ```\n\n**3. Comprehensiveness**\n - Evaluates whether removing the top-k features significantly reduces the model's prediction confidence.\n - A higher score indicates that the top-k features are critical for the prediction.\n\n ```python\n comprehensiveness_score = explanation_metrics.calculate_metrics()['comprehensiveness']\n print(\"Comprehensiveness:\", comprehensiveness_score)\n ```\n\n---\n\n#### Example Output\n\nAfter calculating the metrics, the method returns a DataFrame summarizing the results:\n\n| Metric | Value |\n|-------------------|---------|\n| Faithfulness | 0.89 |\n| Infidelity | 0.05 |\n| Sensitivity | 0.13 |\n| Comprehensiveness | 0.62 |\n| Sufficiency | 0.45 |\n| Monotonicity | 1.00 |\n| Complexity | 7 |\n| Sparseness | 0.81 |\n\n---\n\n### Image Metrics Calculation\n\nThe **`xai_evals`** package provides a powerful class, **`ExplanationMetricsImage`**, to evaluate the quality of explanations generated for image-based deep learning models. This class allows you to calculate several metrics, helping you assess the robustness, reliability, and interpretability of your image explanations. [NOTE: Metrics currently support image-based deep learning models such as PyTorch and TensorFlow.]\n\n#### ExplanationMetricsImage Class\n\nThe **`ExplanationMetricsImage`** class in **`xai_evals`** provides a structured way to evaluate the quality and reliability of image-based explanations, such as GradCAM, Integrated Gradients, and Occlusion. By assessing multiple metrics, you can better understand how well these image explanations align with your model's predictions and behavior. This class uses **Quantus** to calculate the various metrics for evaluating explanations.\n\n---\n\n#### Steps for Using ExplanationMetricsImage\n\n1. **Initialize ExplanationMetricsImage** \n Begin by creating an instance of the **`ExplanationMetricsImage`** class with the necessary inputs, including the model, dataset, and evaluation settings.\n\n2. **Evaluate Explanation Metrics** \n Use the `evaluate` method to compute various metrics for evaluating image-based explanations. The method returns a dictionary with the results.\n\n ```python\n import torch\n import torchvision\n import torchvision.transforms as transforms\n from torch.utils.data import DataLoader\n from tensorflow.keras.datasets import cifar10\n from xai_evals.metrics import ExplanationMetricsImage\n from torchvision import models\n import tensorflow as tf\n import numpy as np\n import torch.optim as optim\n import tensorflow.keras as keras\n\n # --- TensorFlow Setup ---\n # Load CIFAR-10 dataset (for TensorFlow example)\n (x_train, y_train), (x_test, y_test) = cifar10.load_data()\n x_train, x_test = x_train / 255.0, x_test / 255.0 # Normalize the images\n train_data = (x_train, y_train) # Tuple of data and labels\n test_data = (x_test, y_test) # Tuple of data and labels\n\n # Convert to TensorFlow Dataset\n train_dataset_tf = tf.data.Dataset.from_tensor_slices(train_data).batch(32)\n test_dataset_tf = tf.data.Dataset.from_tensor_slices(test_data).batch(32)\n\n # --- PyTorch Setup ---\n # PyTorch Dataset for CIFAR-10\n transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)\n trainloader = DataLoader(trainset, batch_size=4, shuffle=True)\n\n\n # --- Custom Model Setup ---\n # Custom PyTorch model (simple CNN for CIFAR-10)\n class SimpleCNN(torch.nn.Module):\n def __init__(self):\n super(SimpleCNN, self).__init__()\n self.conv1 = torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)\n self.conv2 = torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)\n self.fc1 = torch.nn.Linear(64*8*8, 128)\n self.fc2 = torch.nn.Linear(128, 10) # 10 classes for CIFAR-10\n\n def forward(self, x):\n x = torch.relu(self.conv1(x))\n x = torch.max_pool2d(x, 2)\n x = torch.relu(self.conv2(x))\n x = torch.max_pool2d(x, 2)\n x = x.view(x.size(0), -1)\n x = torch.relu(self.fc1(x))\n x = self.fc2(x)\n return x\n\n # --- TensorFlow Model Setup ---\n model_tf = tf.keras.Sequential([\n tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu', input_shape=(32, 32, 3)),\n tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),\n tf.keras.layers.Conv2D(64, kernel_size=3, activation='relu'),\n tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),\n tf.keras.layers.Flatten(),\n tf.keras.layers.Dense(128, activation='relu'),\n tf.keras.layers.Dense(10) # 10 classes for CIFAR-10\n ])\n\n # Compile the model for training\n model_tf.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])\n # --- TensorFlow Model Training (1 Epoch) ---\n model_tf.fit(train_dataset_tf, epochs=1)\n\n print(\"Finished TensorFlow Training\")\n # Initialize PyTorch model\n model_torch = SimpleCNN()\n model_torch.train() # Set model to training mode\n\n # --- Training PyTorch Model for 1 Epoch ---\n criterion = torch.nn.CrossEntropyLoss()\n optimizer = optim.SGD(model_torch.parameters(), lr=0.001, momentum=0.9)\n\n for epoch in range(1): # Training for 1 epoch\n running_loss = 0.0\n for i, data in enumerate(trainloader, 0):\n inputs, labels = data\n optimizer.zero_grad()\n\n outputs = model_torch(inputs)\n loss = criterion(outputs, labels)\n loss.backward()\n optimizer.step()\n\n running_loss += loss.item()\n if i % 2000 == 1999: # Print every 2000 mini-batches\n print(f\"[{epoch + 1}, {i + 1}] loss: {running_loss / 2000:.3f}\")\n running_loss = 0.0\n\n print(\"Finished PyTorch Training\")\n # --- Example 1: PyTorch Metrics Calculation ---\n metrics_image_pytorch = ExplanationMetricsImage(\n model=model_torch, \n data_loader=trainloader, \n framework=\"torch\", \n num_classes=10\n )\n\n # Example: Calculate metrics using the PyTorch DataLoader\n metrics_results_pytorch = metrics_image_pytorch.evaluate(\n start_idx=0, end_idx=32, \n metric_names=[\"FaithfulnessCorrelation\",\"MaxSensitivity\",\"MPRT\",\"SmoothMPRT\",\"AvgSensitivity\",\"FaithfulnessEstimate\"], \n xai_method_name=\"IntegratedGradients\"\n )\n print(\"PyTorch Example Metrics:\", metrics_results_pytorch)\n # --- Example 2: TensorFlow Metrics Calculation ---\n metrics_image_tensorflow = ExplanationMetricsImage(\n model=model_tf, # Use TensorFlow model for TensorFlow example\n data_loader=train_dataset_tf,\n framework=\"tensorflow\",\n num_classes=10\n )\n\n # Example: Calculate metrics using the TensorFlow Dataset\n metrics_results_tensorflow = metrics_image_tensorflow.evaluate(\n start_idx=0, end_idx=32, \n metric_names=[\"FaithfulnessCorrelation\",\"MaxSensitivity\",\"MPRT\",\"SmoothMPRT\",\"AvgSensitivity\",\"FaithfulnessEstimate\"],\n xai_method_name=\"GradCAM\"\n )\n print(\"TensorFlow Example Metrics:\", metrics_results_tensorflow)\n # --- Example 3: Explain using a single image (numpy array) ---\n single_image_numpy = np.random.randn(1,3,32, 32) # Random image as a NumPy array, [H, W, C]\n label = np.random.randint(0, 9,size=1)\n\n # Initialize ExplanationMetricsImage for a single image (use PyTorch framework even for NumPy array)\n metrics_image_single = ExplanationMetricsImage(\n model=model_torch, # Use PyTorch model\n data_loader=(single_image_numpy,label), # Pass the single image as a numpy array\n framework=\"torch\", # Use the torch framework for single image\n num_classes=10,\n )\n\n # Calculate metrics for the single image\n metrics_single_image = metrics_image_single.evaluate(\n start_idx=0, end_idx=1, \n metric_names=[\"FaithfulnessCorrelation\",\"MaxSensitivity\",\"MPRT\",\"SmoothMPRT\",\"AvgSensitivity\",\"FaithfulnessEstimate\"],\n xai_method_name=\"IntegratedGradients\"\n )\n print(\"Single Image Example Metrics:\", metrics_single_image)\n # --- Example 4: TensorFlow Model with Single Image ---\n single_image_numpy = np.random.randn(1,32, 32,3) # Random image as a NumPy array, [H, W, C]\n label = np.random.randint(0, 9,size=1)\n # For TensorFlow, the single image example using TensorFlow framework\n metrics_image_single_tf = ExplanationMetricsImage(\n model=model_tf, # Use TensorFlow model\n data_loader=(single_image_numpy,label), # Pass the single image as a numpy array\n framework=\"tensorflow\", # Use the tensorflow framework for single image\n num_classes=10\n )\n\n # Calculate metrics for the single image\n metrics_single_image_tf = metrics_image_single_tf.evaluate(\n start_idx=0, end_idx=1, \n metric_names=[\"FaithfulnessCorrelation\",\"MaxSensitivity\",\"MPRT\",\"SmoothMPRT\",\"AvgSensitivity\",\"FaithfulnessEstimate\"],\n xai_method_name=\"GradCAM\"\n )\n print(\"TensorFlow Single Image Example Metrics:\", metrics_single_image_tf)\n\n\n ```\n\n---\n\n#### Explanation Metrics Overview\n\nThe **`ExplanationMetricsImage`** class supports the following key metrics for evaluating image explanations:\n\n| **Metric** | **Purpose** | **Description** |\n|--------------------------|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|\n| **FaithfulnessCorrelation** | Measures the correlation between attribution values and model output changes when perturbing image features. | Higher values indicate that important features (according to the explanation) indeed cause significant changes in the model\u2019s prediction. |\n| **MaxSensitivity** | Measures the maximum sensitivity of an attribution method to input perturbations. | Higher values suggest that the attribution method highlights the most sensitive parts of the image. |\n| **MPRT** | Measures the relevance of features based on perturbations. | Helps evaluate the robustness of the explanation when features are perturbed. |\n| **SmoothMPRT** | A smoother version of MPRT that reduces noise from perturbations. | Ensures more stable results by averaging perturbations. |\n| **AvgSensitivity** | Measures the average sensitivity of the model to input perturbations across all features. | Indicates how sensitive the model is to small changes in the input. |\n| **FaithfulnessEstimate** | Estimates the faithfulness of the attribution by comparing against a perturbation baseline. | Compares how well the explanation reflects the model\u2019s behavior under feature perturbations. |\n\nReference Values for Available Metrics:\n\n| Metric | Typical Range | Interpretation | \"Better\" Direction |\n|--------------------------|-------------------------|---------------------------------------------------------------------------------------------------------|--------------------------------------|\n| FaithfulnessCorrelation | -1 to 1 | Measures correlation between attribution values and changes in model output when features are perturbed. Higher indicates that more important features (according to the explanation) indeed cause larger changes in the model\u2019s prediction. | Higher is better (closer to 1) |\n| MaxSensitivity | \u2265 0 | Measures how well attributions match model sensitivity when perturbing image features. Lower scores indicate that the explanations focus on the most sensitive features. | Lower is better (closer to 0) |\n| MPRT | \u2265 0 | Measures how the perturbation of features affects the model\u2019s prediction. A higher score indicates that the model's prediction is heavily influenced by the perturbed features. | Higher is better |\n| SmoothMPRT | \u2265 0 | Measures the stability of MPRT under perturbation noise. Higher values suggest more stable explanations. | Higher is better |\n| AvgSensitivity | \u2265 0 | Measures the average change in prediction for small changes in input features. Indicates model robustness. | Lower is better |\n| FaithfulnessEstimate | 0 to 1 | Compares model predictions under perturbations and attributions. Higher values indicate better alignment. | Higher is better |\n\n---\n\n#### Initialization Attributes / Contructor for **`ExplanationMetricsImage`** class\n\n| **Attribute** | **Description** | **Values** |\n|----------------------|----------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|\n| `model` | The trained model for which explanations will be evaluated. | [PyTorch model, TensorFlow model] |\n| `data_loader` | The data loader or dataset containing the test data. | [PyTorch Dataset,PyTorch DataLoader,TensorFlow Dataset, tuple of (image-np.array/torch.Tensor/tensorflow.Tensor.Tensor,label-np.array/torch.Tensor/tensorflow.Tensor)] |\n| `framework` | The framework used for the model (either 'torch' or 'tensorflow' or 'backtrace'). | {'torch', 'tensorflow','backtrace'} |\n| `device` | The device (CPU/GPU) used for performing computations (for PyTorch models). | [torch.device (Optional)] |\n| `num_classes` | The number of classes for classification tasks. | Integer (default: 10) |\n\n---\n\n#### Evaluate Function (`evaluate`) for **`ExplanationMetricsImage`** class to calculate metrics\n\n| **Attribute** | **Description** | **Values** |\n|-----------------------|----------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------|\n| `start_idx` | The starting index of the batch for evaluation. | Integer (e.g., 0) |\n| `end_idx` | The ending index of the batch for evaluation. | Integer (e.g., 100, or `None` for the entire batch) |\n| `metric_names` | The list of metric names to evaluate. | List of strings representing the metrics to compute (e.g., `[\"FaithfulnessCorrelation\", \"MaxSensitivity\", \"MPRT\", \"SmoothMPRT\", \"AvgSensitivity\", \"FaithfulnessEstimate\"]`) |\n| `xai_method_name` | The name of the XAI method used for explanations (e.g., 'IntegratedGradients', 'GradCAM', etc.). | String (e.g., for Torch `{grad_cam, integrated_gradients, saliency, deep_lift, gradient_shap, guided_backprop, occlusion, layer_gradcam, feature_ablation}` ; for Tensorflow `{VanillaGradients, GradCAM,GradientsInput,IntegratedGradients,OcclusionSensitivity,SmoothGrad` & for Backtrace `{default,contrast-positive,contrast-negative}`) |\n\n---\n\n\n\n#### Practical Examples\n\n**1. Faithfulness Correlation**\n - Correlates feature attributions with prediction changes when features (pixels) in the image are perturbed.\n - Higher correlation indicates that the explanation aligns well with model predictions.\n\n ```python\n faithfulness_score = metrics_image.evaluate(\n start_idx=0, end_idx=5, metric_names=[\"FaithfulnessCorrelation\"], xai_method_name=\"IntegratedGradients\"\n )['FaithfulnessCorrelation']\n print(\"Faithfulness:\", faithfulness_score)\n ```\n\n**2. Max Sensitivity**\n - Measures the sensitivity of the explanation method by observing the effect of perturbing different parts of the image.\n - A higher score indicates that the explanation method is sensitive to the most influential pixels.\n\n ```python\n max_sensitivity_score = metrics_image.evaluate(\n start_idx=0, end_idx=5, metric_names=[\"MaxSensitivity\"], xai_method_name=\"IntegratedGradients\"\n )['MaxSensitivity']\n print(\"Max Sensitivity:\", max_sensitivity_score)\n ```\n\n---\n\n#### Example Output\n\nAfter calculating the metrics, the method returns a dictionary summarizing the results:\n\n| Metric | Value |\n|--------------------------|---------|\n| FaithfulnessCorrelation | 0.88 |\n| MaxSensitivity | 0.92 |\n\n---\n\n#### Benefits of ExplanationMetrics\n\n- **Interpretability:** Quantifies how well feature attributions explain the model's predictions.\n- **Robustness:** Evaluates the stability of explanations under input perturbations.\n- **Comprehensiveness and Sufficiency:** Provides insights into the contribution of top features to the model\u2019s predictions.\n- **Scalability:** Works with various tasks, including binary classification, multi-class classification, and regression.\n\nBy leveraging these metrics, you can ensure that your explanations are meaningful, robust, and align closely with your model's decision-making process.\n\n---\n\n### Acknowledgements\n\nWe would like to extend our heartfelt thanks to the developers and contributors of the libraries **[Quantus](https://github.com/Trusted-AI/quantus)**, **[Captum](https://captum.ai/)**, **[tf-explain](https://github.com/sicara/tf-explain)**, **[LIME](https://github.com/marcotcr/lime)**, and **[SHAP](https://github.com/slundberg/shap)**, which have been instrumental in enabling the explainability methods implemented in this package.\n\n- **[Quantus](https://github.com/Trusted-AI/quantus)** provides a comprehensive suite of metrics that allow us to evaluate and assess the quality of explanations, ensuring that our interpretability methods are both reliable and robust.\n \n- **[Captum](https://captum.ai/)** is an invaluable tool for PyTorch users, offering a variety of powerful attribution methods like Integrated Gradients, Saliency, and Gradient Shap, which are crucial for generating insights into the inner workings of deep learning models.\n\n- **[tf-explain](https://github.com/sicara/tf-explain)** simplifies the process of explaining TensorFlow/Keras models, with methods like GradCAM and Occlusion Sensitivity, enabling us to generate visual explanations that help interpret the decision-making of complex models.\n\n- **[LIME](https://github.com/marcotcr/lime)** (Local Interpretable Model-Agnostic Explanations) has been a key library for providing local explanations for machine learning models, allowing us to generate understandable explanations for individual predictions.\n\n- **[SHAP](https://github.com/slundberg/shap)** (SHapley Additive exPlanations) is essential for computing Shapley values and provides a unified approach to explaining machine learning models, making it easier to understand feature contributions across a range of model types.\n\nWe are deeply grateful for the contributions these libraries have made in advancing model interpretability, and their seamless integration in our package ensures that users can leverage state-of-the-art methods for understanding machine learning and deep learning models.\n\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n---\n\n### Future Plans\n\nIn the future, we will continue to improve this library.\n\n--- \n\n## Citations\nThis code is free. So, if you use this code anywhere, please cite us:\n```\n@misc{seth2025xaievalsframeworkevaluating,\n title={xai_evals : A Framework for Evaluating Post-Hoc Local Explanation Methods}, \n author={Pratinav Seth and Yashwardhan Rathore and Neeraj Kumar Singh and Chintan Chitroda and Vinay Kumar Sankarapu},\n year={2025},\n eprint={2502.03014},\n archivePrefix={arXiv},\n primaryClass={cs.LG},\n url={https://arxiv.org/abs/2502.03014}, \n}\n```\n\n## Get in touch\nContanct us at [AryaXAI](https://www.aryaxai.com/).\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A package for model explainability and explainability comparision for tabular data",
"version": "0.0.10",
"project_urls": {
"Homepage": "https://github.com/AryaXAI/xai_evals"
},
"split_keywords": [
"aryaxai deep learning backtrace",
" ml observability"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "33cdf08fee94ee40310d512857fe1d184b5562369e886cb8ca85b44436ce0ba2",
"md5": "5c9f13170dddd7f97c75e44481eead3f",
"sha256": "0fc91ac2100eb72e531c27bc830cc2724f8ac0a2a861c53a340fc9298476e65f"
},
"downloads": -1,
"filename": "xai_evals-0.0.10-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5c9f13170dddd7f97c75e44481eead3f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.0",
"size": 31272,
"upload_time": "2025-02-10T06:16:34",
"upload_time_iso_8601": "2025-02-10T06:16:34.429294Z",
"url": "https://files.pythonhosted.org/packages/33/cd/f08fee94ee40310d512857fe1d184b5562369e886cb8ca85b44436ce0ba2/xai_evals-0.0.10-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e5cc8379d2c135be5f7d092a09c9bee49429f65ccc4576ba4da779041c55d978",
"md5": "2d45d28253b4eea3bbe919f248ac14d3",
"sha256": "65d37553a948bdc1ee48788ed7b829803404b0614e39954ee05ea770a26760f5"
},
"downloads": -1,
"filename": "xai_evals-0.0.10.tar.gz",
"has_sig": false,
"md5_digest": "2d45d28253b4eea3bbe919f248ac14d3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.0",
"size": 155629,
"upload_time": "2025-02-10T06:16:36",
"upload_time_iso_8601": "2025-02-10T06:16:36.292675Z",
"url": "https://files.pythonhosted.org/packages/e5/cc/8379d2c135be5f7d092a09c9bee49429f65ccc4576ba4da779041c55d978/xai_evals-0.0.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-10 06:16:36",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AryaXAI",
"github_project": "xai_evals",
"github_not_found": true,
"lcname": "xai-evals"
}