What data scientists get wrong about explainability

The median data scientist sees explainability as an “irrational constraint that limits my ability to get max predictive power”.  You have to go more than one sigma to the right before you run into data scientists who clearly “get” XAI.

Here are ten examples of what the median data scientist gets wrong about the black box problem and explainability:

Judge AIs as alternatives rather than aides:

Considering machine learning systems as a replacement for a human makes it easier to assume that no detailed explanation is needed so long as they do their job.  However, most AIs will be aides that augment human decision making, and those clearly need to explain their results.  Read more …

Expect stakeholders to “think more like me”:

Machine learning practitioners must adapt our systems to fit the expectations of the world, rather than expect the world to adapt to our expectations. Read more …

Optimize for model performance over enterprise utility:

It’s natural for data scientists to see everything as an optimization problem.  The trick is knowing what to optimize for.  It is tempting to work at the lab bench, focused on optimizing your model as measured by the typical machine learning metrics.  What is harder to do, but far more valuable, is to optimize for the benefits and overall utility of the encompassing system.  Read more …
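
As a minimal sketch of what that can look like, the snippet below scores a classifier against a hypothetical business cost structure (the dollar figures and cost names are placeholders, not real numbers) and picks the decision threshold that maximizes that utility rather than accuracy.

```python
# Sketch: scoring a classifier by business utility rather than raw accuracy.
# The dollar values below are hypothetical placeholders, not real costs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

COST_FALSE_NEGATIVE = 500.0   # assumed cost of a missed positive case
COST_FALSE_POSITIVE = 50.0    # assumed cost of a needless follow-up
VALUE_TRUE_POSITIVE = 200.0   # assumed value of a correctly flagged case

def enterprise_utility(y_true, y_pred):
    """Net value of the predictions under the assumed cost structure."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp * VALUE_TRUE_POSITIVE - fp * COST_FALSE_POSITIVE - fn * COST_FALSE_NEGATIVE

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Choose the decision threshold that maximizes utility, not accuracy.
probs = model.predict_proba(X_test)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
utilities = [enterprise_utility(y_test, (probs >= t).astype(int)) for t in thresholds]
best_t = thresholds[int(np.argmax(utilities))]
print(f"default threshold 0.5 vs utility-optimal threshold {best_t:.2f}")
```

The model is unchanged; what changes is the yardstick, which is exactly the shift from lab-bench metrics to the utility of the encompassing system.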

Value XAI only as a placebo: 

It is easy to imagine that the only benefit of explainability is to placate the users.  That is a tempting idea, but it is also wrong.  Read more …

Believe what is said is what will be heard:

There is often a disconnect between what a machine learning model is actually communicating and what the stakeholders are hearing.  Users can extrapolate the recognition of a pattern into unwarranted confidence in a presumed course of action.  Read more …

Provide a single explanation for all audiences:

Data scientists tend to think in terms of rigorous ideals.  They like clear-cut goals and provably correct answers.  So it is natural to think that a given model or a specific result has a single definitive explanation.  But of course it doesn’t: different audiences need different explanations.  Read more …
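
As an illustration, the sketch below fits one linear model and produces two views of it: a ranked global summary for leadership and a per-decision breakdown for a case reviewer.  The feature names and data are invented for the example; the point is that both explanations come from the same fitted model.

```python
# Sketch: one model, two explanation views for different audiences.
# Feature names and the data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

feature_names = ["income", "debt_ratio", "late_payments", "account_age"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 1] + 0.5 * X[:, 2] - 0.3 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

# Audience 1: leadership wants a ranked, global summary of what drives decisions.
global_importance = sorted(zip(feature_names, np.abs(model.coef_[0])),
                           key=lambda kv: kv[1], reverse=True)
print("Global drivers:", [name for name, _ in global_importance])

# Audience 2: a case reviewer wants the breakdown for one specific decision.
x = scaler.transform(X[:1])[0]
contributions = model.coef_[0] * x
for name, c in sorted(zip(feature_names, contributions), key=lambda kv: -abs(kv[1])):
    print(f"{name:>15}: {c:+.3f}")
```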

Undervalue explanation friendly features:

Model features shouldn’t be judged just on statistical equivalence, runtime speed, and server resource use.  They should also be judged on whether they clarify the connection between our model and reality, and on how much they contribute to explanations and generalizability.  Read more …
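
The sketch below illustrates the idea with invented data: two feature sets intended to carry similar predictive signal, one built from features a domain expert can reason about and one from an opaque composite.  Permutation importance on the first maps directly onto concepts stakeholders recognize; the composite leaves little to explain even when its accuracy looks comparable.

```python
# Sketch: comparing two candidate feature sets by how well each supports
# explanation, not just accuracy. The data and feature names are invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
tenure_months = rng.integers(1, 120, size=n)
support_calls = rng.poisson(2, size=n)
churn_logit = 0.8 * support_calls - 0.03 * tenure_months
y = (rng.random(n) < 1 / (1 + np.exp(-churn_logit))).astype(int)

# Option A: directly meaningful features a domain expert can reason about.
X_meaningful = np.column_stack([tenure_months, support_calls])
# Option B: an opaque composite intended to carry similar signal.
X_opaque = np.column_stack([support_calls / np.sqrt(tenure_months)])

for label, X in [("meaningful", X_meaningful), ("opaque composite", X_opaque)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
    print(label, "accuracy:", round(model.score(X_te, y_te), 3),
          "importances:", np.round(result.importances_mean, 3))
```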

Fail to design for debugging:

Every system will have failures in production.  Robust systems build in, from the start, the mechanisms needed to quickly isolate and resolve those failures.  For example, if we make it easier to distinguish expected outliers from true errors, we will accelerate troubleshooting.  Read more …
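
One hedged sketch of what that can mean in practice: a thin wrapper (the class and percentile thresholds below are illustrative, not a standard API) that records out-of-range feature values at inference time, so that when a prediction is questioned later you can quickly see whether it came from an unusual but legitimate input or from a genuine data error.

```python
# Sketch: a prediction wrapper that records enough context at inference time
# to separate expected outliers from true errors later. Names are illustrative.
import logging
import numpy as np

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model_debug")

class DebuggablePredictor:
    def __init__(self, model, X_train):
        self.model = model
        # Record the training range of each feature to flag novel inputs.
        self.lo = np.percentile(X_train, 1, axis=0)
        self.hi = np.percentile(X_train, 99, axis=0)

    def predict(self, X):
        X = np.atleast_2d(X)
        preds = self.model.predict(X)
        for row, pred in zip(X, preds):
            out_of_range = np.flatnonzero((row < self.lo) | (row > self.hi))
            if out_of_range.size:
                # Flag, don't fail: these may be legitimate but rare inputs.
                log.info("prediction=%s out_of_range_features=%s",
                         pred, out_of_range.tolist())
        return preds

# Usage (assuming `model` is any fitted estimator with a .predict method):
#   predictor = DebuggablePredictor(model, X_train)
#   predictor.predict(X_new)
```

Wrapping predictions this way costs little up front, and it means the evidence needed for triage already exists when a result is questioned.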

Assume rather than demonstrate generalizability:

We tend to be overly optimistic about our models’ ability to generalize.  Models that work well on the lab bench and in initial production use may still have latent limitations on their ability to succeed across time, geography, and use case.  XAI approaches can reveal those issues and allow non-data scientists to build confidence in how broadly a model can be applied.
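
A minimal sketch of demonstrating rather than assuming: score the model per deployment slice instead of only in aggregate.  The "region" column and the weaker signal in one region are invented here to show how a single aggregate number can hide a slice that does not generalize.

```python
# Sketch: checking generalizability by scoring per slice instead of in aggregate.
# The "region" slices and the planted weak spot are hypothetical illustrations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 5))
region = rng.choice(["north", "south", "east", "west"], size=n)
# Assume the signal is noisier in one region, which an aggregate score would hide.
signal = X[:, 0] + np.where(region == "west", rng.normal(scale=2.0, size=n), 0)
y = (signal > 0).astype(int)

X_tr, X_te, y_tr, y_te, reg_tr, reg_te = train_test_split(X, y, region, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

print("aggregate accuracy:", round(model.score(X_te, y_te), 3))
for r in np.unique(reg_te):
    mask = reg_te == r
    print(f"  {r:>5} accuracy:", round(model.score(X_te[mask], y_te[mask]), 3))
```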

Think moonshots are the model:

When we think of AI, what comes to mind first are the moonshots: the decade-long projects that use breakthrough new technology to implement radical solutions to huge problems.  However, consider the distribution of all machine learning projects over the next decade: only a tiny sliver of them will be these ultra-expensive, ground-breaking projects.  We need to design our common toolsets and methodologies for the median machine learning project, not the moonshots.  Read more …