Explaining Explanations

In the new e-book Interpretable Machine Learning, Christoph Molnar provides a good framework for thinking about explanations from the point of view of a data scientist.

He articulates a range of different explanation scopes relevant to XAI (a short code sketch of the global versus local distinction follows the list):

  • Algorithm transparency: How does the algorithm create the model?
  • Global, Holistic Model Interpretability: How does the trained model make predictions?
  • Global Model Interpretability on a Modular Level: How do parts of the model influence predictions?
  • Local Interpretability for a Single Prediction: Why did the model make a specific decision for an instance?
  • Local Interpretability for a Group of Predictions: Why did the model make specific decisions for a group of instances?

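As a rough illustration of the global versus local distinction, the sketch below (my own, not from Christoph's book) trains a simple linear model with scikit-learn and then asks both questions: which features matter to the trained model overall, and why did it produce this particular prediction for one instance? The dataset and model choice are illustrative assumptions.

```python
from sklearn.datasets import load_diabetes
from sklearn.inspection import permutation_importance
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Toy regression data with named features (illustrative choice)
X, y = load_diabetes(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Ridge(alpha=1.0).fit(X_train, y_train)

# Global interpretability on a modular level:
# how much does each feature matter to the trained model overall?
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, imp in sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1]):
    print(f"{name:>6}: {imp:.3f}")

# Local interpretability for a single prediction:
# why this value for this one instance? For a linear model the per-feature
# contribution is simply coefficient * feature value.
instance = X_test.iloc[[0]]
contributions = model.coef_ * instance.to_numpy().ravel()
print("prediction:", model.predict(instance)[0])
for name, c in sorted(zip(X.columns, contributions), key=lambda t: -abs(t[1])):
    print(f"{name:>6}: {c:+.2f}")
```

The same trained model supports both kinds of answers; what changes is the question being asked and the audience it is asked for.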
He also suggests how to think about what makes a good explanation in general, beyond the data-science-specific perspective. In the list below I have merged Christoph's framework with one we have used previously.

Good explanations are some mix of:

  • Contrastive: Humans usually don’t ask why a certain prediction was made, but rather why this prediction was made instead of another prediction (see the counterfactual sketch after this list).
  • Selective: People don’t expect explanations to be exhaustive, they want to be guided to the most important elements.
  • Contextual: Explanations are part of a larger interaction and are interpreted within the context of that interaction, given the nature of the audience’s perspective.
  • Exceptional: People focus more on abnormal causes to explain events, particularly where removing those abnormal causes would have changed the outcome substantially.
  • Truthful: Good explanations should be true. However, they don’t need to be exhaustive, and they don’t need to be formally provable.
  • Sensitive to prior beliefs:  Explanations are more effective if they are presented in a way that builds on existing beliefs and baseline processes. Cliff Kuang describes a good healthcare example.
  • General and probable:  In the absence of some abnormal event, a general explanation is usually judged to be good.
  • Causal: Technically, XAI explanations explain why the model delivered a given result. Of course, users find an explanation more satisfying if it reflects an actual causal relationship in the real world and not just a correlation.
  • More complete: Typically, an explanation that accounts for more facts and observations is considered better than one that accounts for fewer.
  • Falsifiable: This is not always possible, but of course we prefer explanations that are not just causal but also falsifiable, allowing a form of verification.
  • Hard to vary: Good explanations are constrained in ways that make them hard to vary. They might be constrained by existing knowledge (such as knowledge about causal relationships) or easily generated knowledge (such as additional testable predictions that can be generated through the model).

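To make the contrastive point concrete, here is a minimal, hypothetical counterfactual sketch: a brute-force search for the smallest single-feature change that would flip a classifier's decision. The dataset, model, and greedy single-feature search are illustrative assumptions on my part, not a method from Christoph's book or from any particular XAI library.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()          # illustrative dataset choice
X, y = data.data, data.target
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)

x = X[0].copy()                      # the instance whose prediction we explain
original = model.predict([x])[0]

# Brute-force search: for each feature, scan its observed range and keep the
# cheapest single change (measured in standard deviations) that flips the decision.
best = None
for j in range(X.shape[1]):
    for candidate in np.linspace(X[:, j].min(), X[:, j].max(), num=50):
        x_cf = x.copy()
        x_cf[j] = candidate
        if model.predict([x_cf])[0] != original:
            cost = abs(candidate - x[j]) / (X[:, j].std() + 1e-12)
            if best is None or cost < best[0]:
                best = (cost, data.feature_names[j], x[j], candidate)

if best is not None:
    cost, name, old, new = best
    print(f"Predicted class {original}; the decision would flip if '{name}' "
          f"changed from {old:.2f} to {new:.2f} (about {cost:.2f} std devs).")
```

The output answers the contrastive question directly: not just "why this prediction?" but "what would have had to be different for the other prediction?", while also being selective in that it reports only the single most economical change.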
The main takeaway from the above is not to think of an XAI explanation as a single, unitary platonic ideal, but rather to think about matching the explanation to the audience.
