LIME Explainer

Overview

LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by learning an interpretable surrogate model locally around the prediction. It perturbs the input data and observes how the model's predictions change in order to estimate each feature's importance.

Class: LIME

Initialization

from chemxai.explainers import LIME

explainer = LIME(
    model=model,
    background_tensor=background_data,
    test_tensor=test_data,
    device=device,
    mode='regression'
)

Parameters

  • model (torch.nn.Module): The trained model to be explained
  • background_tensor (torch.Tensor): Training data used by LIME to understand feature distributions
  • test_tensor (torch.Tensor): Test data to be explained
  • device (torch.device): Device for computations (CPU or GPU)
  • mode (str, optional): Task type - 'regression' or 'classification' (default: 'regression')

Methods

explain_local(index, num_features=None)

Generates a local explanation for a specific instance using LIME.

Parameters:

  • index (int): Index of the instance in the test set to be explained
  • num_features (int, optional): Number of features to include in explanation. If None, uses all features

Returns:

  • list: LIME importance values for each feature of the specified instance

Example:

# Explain the first instance
explanation = explainer.explain_local(index=0)
print(f"LIME values: {explanation}")

# Explain with top 10 features only
explanation_top10 = explainer.explain_local(index=0, num_features=10)

predict_fn(data)

Internal prediction function that adapts the PyTorch model for LIME compatibility.

Parameters:

  • data (numpy.ndarray): Input data for prediction

Returns:

  • numpy.ndarray: Model predictions
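
The exact wrapper is internal to chemxai, but an adapter of this kind typically follows the sketch below. make_predict_fn and its details are assumptions for illustration, not the library's code:

import torch

def make_predict_fn(model, device):
    # Build a LIME-compatible prediction function around a PyTorch model (sketch only).
    def predict_fn(data):
        # LIME passes perturbed samples as a NumPy array; move them to the model's device
        inputs = torch.as_tensor(data, dtype=torch.float32).to(device)
        with torch.no_grad():
            outputs = model(inputs)
        # LIME expects NumPy back: one value per sample for regression,
        # a (n_samples, n_classes) probability matrix for classification
        return outputs.detach().cpu().numpy().squeeze()
    return predict_fn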

Usage Example

import torch
from chemxai.explainers import LIME
from chemxai.data import qm9_tabular

# Load data
qm9 = qm9_tabular()
train_loader, _, test_loader, _, _, _, _ = qm9.get_paired_dataloaders_tabular(
    batch_size=32, 
    descriptor_type='Morgan', 
    morgan_radius=3, 
    morgan_nBits=512
)

# Get background and test data
background_batch = next(iter(train_loader))
test_batch = next(iter(test_loader))

# Initialize explainer
explainer = LIME(
    model=your_model,
    background_tensor=background_batch[0],
    test_tensor=test_batch[0],
    device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'),
    mode='regression'
)

# Generate explanation for first instance
explanation = explainer.explain_local(index=0)

# Generate explanation with specific number of features
explanation_limited = explainer.explain_local(index=0, num_features=20)

Technical Details

How LIME Works

  1. Perturbation: Generates perturbed samples around the instance of interest
  2. Prediction: Gets model predictions for perturbed samples
  3. Weighting: Assigns weights based on proximity to original instance
  4. Fitting: Fits a simple interpretable model (linear model) to weighted samples
  5. Explanation: Returns coefficients of the interpretable model as feature importance
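
As an illustration of these five steps, the sketch below implements a bare-bones tabular LIME by hand. It is not the chemxai implementation; model_predict, the Gaussian perturbations, the kernel width, and the use of Ridge as the surrogate are all illustrative assumptions:

import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(x, model_predict, num_samples=1000, kernel_width=0.75):
    n_features = x.shape[0]
    # 1. Perturbation: sample points around the instance of interest
    perturbed = x + np.random.normal(0.0, 1.0, size=(num_samples, n_features))
    # 2. Prediction: query the black-box model on the perturbed samples
    preds = model_predict(perturbed)
    # 3. Weighting: samples closer to x get higher weight (exponential kernel)
    distances = np.linalg.norm(perturbed - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 4. Fitting: fit a weighted linear surrogate model to the predictions
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    # 5. Explanation: the surrogate's coefficients are the local feature importances
    return surrogate.coef_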

Feature Mapping

The explainer handles feature mapping by:

  • Using regular expressions to extract feature indices from LIME output
  • Maintaining consistent feature ordering with the original input
  • Returning zero importance for features not selected by LIME
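
With discretize_continuous=True, LIME reports conditions such as ('feature_12 > 0.00', 0.34) via Explanation.as_list(). The sketch below shows how such pairs can be mapped back to a full-length importance vector; feature names of the form feature_<i> and the helper itself are assumptions for illustration:

import re

def map_lime_to_full_vector(lime_pairs, n_features):
    # lime_pairs: list of (condition_string, importance) from explanation.as_list()
    importances = [0.0] * n_features                      # unselected features stay at zero
    for condition, value in lime_pairs:
        match = re.search(r"feature_(\d+)", condition)    # recover the index from e.g. "feature_12 > 0.00"
        if match:
            importances[int(match.group(1))] = value      # preserve the original feature ordering
    return importances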

Model Compatibility

  • Works with any PyTorch model that accepts tabular input
  • Supports both regression and classification tasks
  • Automatically handles tensor conversion between PyTorch and NumPy

Configuration Options

The underlying LimeTabularExplainer is configured with:

  • training_data: Background tensor used to model feature distributions
  • mode: Regression or classification
  • feature_names: Automatically generated feature names
  • discretize_continuous: Set to True so continuous features are binned, which makes explanations easier to interpret
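
For reference, a LimeTabularExplainer configured with these options looks roughly like the sketch below (variable names are placeholders; the actual call inside chemxai may differ):

from lime.lime_tabular import LimeTabularExplainer

# NumPy view of the background tensor from the usage example above
background_array = background_batch[0].cpu().numpy()
n_features = background_array.shape[1]

lime_explainer = LimeTabularExplainer(
    training_data=background_array,                                # feature distributions
    mode='regression',                                             # or 'classification'
    feature_names=[f"feature_{i}" for i in range(n_features)],     # auto-generated names
    discretize_continuous=True                                     # bin continuous features
)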

Performance Considerations

  • Number of perturbations affects accuracy and computation time
  • Larger feature spaces require more perturbations for stable results
  • Background data size influences the quality of perturbations
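
If stability versus runtime becomes a concern, the underlying lime library exposes num_samples on explain_instance. The sketch below calls it directly, assuming the lime_explainer and predict_fn sketched above; it is not part of the documented chemxai API:

# NumPy view of the test tensor from the usage example above
test_array = test_batch[0].cpu().numpy()

explanation = lime_explainer.explain_instance(
    data_row=test_array[0],       # the instance to explain
    predict_fn=predict_fn,        # model wrapper returning NumPy predictions
    num_features=20,              # report only the top 20 features
    num_samples=2000              # fewer perturbed samples: faster but noisier (lime's default is 5000)
)
print(explanation.as_list())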