LIME Explainer¶
Overview¶
LIME (Local Interpretable Model-agnostic Explanations) explains individual predictions by learning an interpretable model locally around the prediction. It perturbs the input data and observes the changes in predictions to understand feature importance.
Class: LIME¶
Initialization¶
from chemxai.explainers import LIME
explainer = LIME(
    model=model,
    background_tensor=background_data,
    test_tensor=test_data,
    device=device,
    mode='regression'
)
Parameters¶
- model (torch.nn.Module): The trained model to be explained
- background_tensor (torch.Tensor): Training data used by LIME to understand feature distributions
- test_tensor (torch.Tensor): Test data to be explained
- device (torch.device): Device for computations (CPU or GPU)
- mode (str, optional): Task type, either 'regression' or 'classification' (default: 'regression')
Methods¶
explain_local(index, num_features=None)¶
Generates a local explanation for a specific instance using LIME.
Parameters:
- index (int): Index of the instance in the test set to be explained
- num_features (int, optional): Number of features to include in the explanation. If None, all features are used
Returns:
- list: LIME importance values for each feature of the specified instance
Example:
# Explain the first instance
explanation = explainer.explain_local(index=0)
print(f"LIME values: {explanation}")
# Explain with top 10 features only
explanation_top10 = explainer.explain_local(index=0, num_features=10)
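The returned list is ordered like the input features, so it can be paired with feature indices directly. A small post-processing sketch (the ranking step is illustrative and not part of the chemxai API):
# Rank features by absolute LIME importance (illustrative post-processing)
importances = explainer.explain_local(index=0)
ranked = sorted(enumerate(importances), key=lambda pair: abs(pair[1]), reverse=True)
for feature_idx, value in ranked[:5]:
    print(f"Feature {feature_idx}: {value:.4f}")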
predict_fn(data)¶
Internal prediction function that adapts the PyTorch model for LIME compatibility.
Parameters:
- data (numpy.ndarray): Input data for prediction
Returns:
- numpy.ndarray: Model predictions
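The exact implementation is internal to chemxai; a minimal sketch of what such an adapter typically looks like for a regression model (model and device refer to the objects passed at initialization) is:
# Hypothetical sketch of a LIME-compatible prediction adapter, not the
# actual chemxai implementation: convert the NumPy input to a tensor,
# run the model without tracking gradients, and return a NumPy array.
import numpy as np
import torch

def predict_fn(data: np.ndarray) -> np.ndarray:
    inputs = torch.tensor(data, dtype=torch.float32).to(device)
    with torch.no_grad():
        outputs = model(inputs)
    # LIME's regression mode expects a 1-D array of predictions
    return outputs.cpu().numpy().flatten()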
Usage Example¶
import torch
from chemxai.explainers import LIME
from chemxai.data import qm9_tabular
# Load data
qm9 = qm9_tabular()
train_loader, _, test_loader, _, _, _, _ = qm9.get_paired_dataloaders_tabular(
    batch_size=32,
    descriptor_type='Morgan',
    morgan_radius=3,
    morgan_nBits=512
)
# Get background and test data
background_batch = next(iter(train_loader))
test_batch = next(iter(test_loader))
# Initialize explainer
explainer = LIME(
    model=your_model,
    background_tensor=background_batch[0],
    test_tensor=test_batch[0],
    device=torch.device('cuda' if torch.cuda.is_available() else 'cpu'),
    mode='regression'
)
# Generate explanation for first instance
explanation = explainer.explain_local(index=0)
# Generate explanation with specific number of features
explanation_limited = explainer.explain_local(index=0, num_features=20)
Technical Details¶
How LIME Works¶
- Perturbation: Generates perturbed samples around the instance of interest
- Prediction: Gets model predictions for perturbed samples
- Weighting: Assigns weights based on proximity to the original instance
- Fitting: Fits a simple interpretable model (a linear model) to the weighted samples
- Explanation: Returns the coefficients of the interpretable model as feature importances (a sketch of this procedure follows below)
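A conceptual sketch of these five steps for a regression model (a simplified illustration, not the lime library's internals, which add discretization and feature selection):
# Conceptual sketch of LIME's local surrogate fit (simplified illustration)
import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(predict_fn, x, num_samples=1000, kernel_width=0.75):
    # 1. Perturbation: sample points around the instance of interest
    perturbed = x + np.random.normal(scale=0.1, size=(num_samples, x.shape[0]))
    # 2. Prediction: query the black-box model on the perturbed samples
    preds = predict_fn(perturbed)
    # 3. Weighting: closer samples receive higher weight (exponential kernel)
    distances = np.linalg.norm(perturbed - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # 4. Fitting: a weighted linear model approximates the model locally
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbed, preds, sample_weight=weights)
    # 5. Explanation: the coefficients act as local feature importances
    return surrogate.coef_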
Feature Mapping¶
The explainer handles feature mapping by:
- Using regular expressions to extract feature indices from the LIME output
- Maintaining consistent feature ordering with the original input
- Returning zero importance for features not selected by LIME (illustrated in the sketch below)
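A hypothetical sketch of this mapping, assuming auto-generated names of the form feature_<i> and the (description, weight) pairs returned by the lime library's explanation.as_list():
# Map LIME's (feature description, weight) pairs back onto the original
# feature order; unmentioned features keep an importance of zero.
# Illustrative only, not chemxai's exact code.
import re
import numpy as np

def map_to_feature_vector(lime_pairs, num_features):
    importances = np.zeros(num_features)
    for description, weight in lime_pairs:
        match = re.search(r"feature_(\d+)", description)
        if match:
            importances[int(match.group(1))] = weight
    return importances.tolist()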
Model Compatibility¶
- Works with any PyTorch model that accepts tabular input
- Supports both regression and classification tasks (a classification adapter sketch follows this list)
- Automatically handles tensor conversion between PyTorch and NumPy
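For classification, the lime library expects the prediction function to return class probabilities of shape (n_samples, n_classes). A sketch of such an adapter, assuming the model outputs raw logits:
# Sketch of a classification adapter (assumes the model returns logits)
import numpy as np
import torch
import torch.nn.functional as F

def predict_proba(data: np.ndarray) -> np.ndarray:
    inputs = torch.tensor(data, dtype=torch.float32).to(device)
    with torch.no_grad():
        probs = F.softmax(model(inputs), dim=1)
    # LIME's classification mode expects shape (n_samples, n_classes)
    return probs.cpu().numpy()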
Configuration Options¶
The underlying LimeTabularExplainer is configured with the following options (a construction sketch follows the list):
- training_data: Background tensor for understanding feature distributions
- mode: Regression or classification
- feature_names: Automatically generated feature names
- discretize_continuous: Set to True so continuous features are discretized into interpretable bins and handled like categorical features
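A sketch of how such an explainer could be constructed with the lime library directly (the feature-name generation and the background_data tensor are assumptions for illustration):
# Construct a LimeTabularExplainer with the options listed above
import numpy as np
from lime.lime_tabular import LimeTabularExplainer

background = background_data.numpy()
feature_names = [f"feature_{i}" for i in range(background.shape[1])]

lime_explainer = LimeTabularExplainer(
    training_data=background,
    mode='regression',
    feature_names=feature_names,
    discretize_continuous=True
)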
Performance Considerations¶
- The number of perturbations affects both accuracy and computation time (see the note below)
- Larger feature spaces require more perturbations for stable results
- Background data size influences the quality of perturbations
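chemxai's explain_local does not document a perturbation-count parameter; when calling the lime library directly, the accuracy/runtime trade-off is controlled via num_samples in explain_instance. The call below reuses the sketched lime_explainer and predict_fn and the test_batch from the usage example, and is illustrative rather than part of chemxai's API:
# More perturbations give more stable explanations but take longer to compute
explanation = lime_explainer.explain_instance(
    test_batch[0][0].numpy(),   # instance to explain (1-D NumPy array)
    predict_fn,                 # adapter returning NumPy predictions
    num_features=20,            # keep only the top 20 features
    num_samples=5000            # perturbed samples drawn around the instance
)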