Method overview
Goals: understanding feature importance for individual predictions.
What it does: SHAP (SHapley Additive exPlanations) explains feature influence with a game-theoretic approach that treats features as players in a cooperative game. Intuitively, the importance of a feature is how much it pushes the model's output toward the prediction. To determine this contribution, different feature combinations (“teams”) are considered; a minimal usage sketch follows this overview.
Limitations: Computing SHAP values exactly is computationally prohibitive because the number of feature combinations is exponential in the number of features.
XAI taxonomy: model-agnostic, post-hoc, local (global explanations possible by aggregating SHAP values over multiple instances)
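As a quick illustration (the dataset, model, and plot here are arbitrary choices, not prescribed by SHAP), attributions for a scikit-learn model can be obtained with the Python SHAP library roughly as follows:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

# Toy regression data and model, chosen only for illustration.
X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

# The generic Explainer picks a suitable algorithm for the model type
# (e.g. a tree-based explainer for random forests), using X as background data.
explainer = shap.Explainer(model, X)
shap_values = explainer(X)

# One contribution per feature for each instance; together with the base value
# they add up to the model's prediction for that instance.
print(shap_values[0].values)
shap.plots.waterfall(shap_values[0])
```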
Details and further intuition
Core intuition: SHAP assigns each feature a value that represents how much that feature contributes to a specific prediction, based on principles from cooperative game theory. SHAP treats each feature as a “player” in a game whose payout is the prediction. Each feature is imagined to join every possible team (coalition) of the other features, and its contribution is measured by how much it changes the team’s score (the model’s output) each time it joins. The value assigned to a feature is the average of these marginal contributions over all possible coalitions, so the process considers not only individual impacts but also how features work together.
- The “payout” (the model’s prediction) is split among the features according to how much each contributes to it
- The sum of all SHAP values always equals the difference between the model’s output and a baseline (the expected model output), so each value directly shows a feature’s individual impact; see the formal sketch below
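As a formal sketch (following Lundberg and Lee, 2017), the SHAP value of feature i is its Shapley value: a weighted average of its marginal contributions over all subsets S of the remaining features, where f_x(S) denotes the expected model output when only the features in S are known:

```latex
% Shapley value of feature i; F is the set of all features and
% f_x(S) the expected model output given only the features in S.
\phi_i = \sum_{S \subseteq F \setminus \{i\}}
         \frac{|S|!\,(|F| - |S| - 1)!}{|F|!}
         \bigl[ f_x(S \cup \{i\}) - f_x(S) \bigr]

% Local accuracy: the contributions plus the baseline recover the prediction.
f(x) = \phi_0 + \sum_{i \in F} \phi_i,
\qquad \phi_0 = \mathbb{E}\bigl[ f(X) \bigr]
```

The sum ranges over all 2^(|F|−1) subsets, which is why exact computation by enumeration is prohibitive (see Limitations above); in practice the SHAP library uses model-specific algorithms (e.g. Tree SHAP) or sampling-based approximations (e.g. Kernel SHAP).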
TODO: SHAP force plots, some formalization.
Further resources
- original paper: A Unified Approach to Interpreting Model Predictions (S. Lundberg and S.-I. Lee, 2017)
- Code/Packages/Implementations:
- Exploring SHAP explanations for image classification – Christian Garbin’s blog – CIFAR-10 attributions
- Python Library: SHAP
- from this project see also: Grad-CAM (Gradient-weighted Class Activation Mapping), LIME (Local Interpretable Model-Agnostic Explanations)