Why XAI: Correlation drift

[Image: XKCD comic on correlation]
XKCD strikes again

A team at Google describes one type of correlation drift:

“Machine learning systems often have a difficult time distinguishing the impact of correlated features. This may not seem like a major problem: if two features are always correlated, but only one is truly causal, it may still seem okay to ascribe credit to both and rely on their observed co-occurrence. However, if the world suddenly stops making these features co-occur, prediction behavior may change significantly.”
– Machine Learning: The High Interest Credit Card of Technical Debt

In this case, things that were correlated stop being correlated, and that is not because the underlying reality changed: it is because the system learned to generate results from a correlation that was never robust. There was no meaningful underlying causality behind that correlation.
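
To make that concrete, here is a minimal sketch in Python using scikit-learn and synthetic data. Everything in it (feature names, coefficients, noise scales) is an illustrative assumption, not code from the Google paper: a model trained while two features co-occur splits credit between them, and its predictions degrade once the co-occurrence stops.

```python
# Sketch of correlation drift on synthetic data (illustrative only).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Training world: b always co-occurs with the truly causal feature a.
a = rng.normal(size=1000)
b = a + rng.normal(scale=0.05, size=1000)       # nearly perfectly correlated
y = 3.0 * a + rng.normal(scale=0.1, size=1000)  # only a drives the target

model = Ridge(alpha=1.0).fit(np.column_stack([a, b]), y)
print("coefficients:", model.coef_)  # credit is split between a and b

# Deployment world: the co-occurrence stops; b is now independent noise.
a_new = rng.normal(size=1000)
b_new = rng.normal(size=1000)
preds = model.predict(np.column_stack([a_new, b_new]))
print("mean abs error after drift:", np.abs(preds - 3.0 * a_new).mean())
```

The model looked fine in the training world; it only fails once the world stops cooperating.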

Domain experts are well positioned to recognize these non-robust correlations. However, they can only do that if the results they are presented come with some explanation. If the system is a black box, we will likely become aware of these brittle correlations only after we have been bitten in the rear in production.
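
As a hedged illustration of what such an explanation buys you, the sketch below uses scikit-learn's permutation importance on the same kind of synthetic setup as above. The feature names and the review framing are assumptions for the example: an expert shown the importances can see the model leaning on the merely correlated feature and flag it before it drifts.

```python
# Surfacing the brittle dependency with permutation importance (illustrative).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
a = rng.normal(size=1000)                  # truly causal feature
b = a + rng.normal(scale=0.05, size=1000)  # merely correlated feature
y = 3.0 * a + rng.normal(scale=0.1, size=1000)

X = np.column_stack([a, b])
model = Ridge(alpha=1.0).fit(X, y)

# Report which inputs the model actually leans on.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, score in zip(["a (causal)", "b (correlated only)"],
                       result.importances_mean):
    print(f"{name}: importance {score:.3f}")
# Non-trivial importance on b is the red flag a domain expert can catch
# here, before the correlation stops holding in production.
```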
