Actionable Interpretability through Optimizable Counterfactual Explanations for Tree Ensembles

27 November 2019

Ana Lucic

Maarten de Rijke

Abstract

Model interpretability has become an important problem in machine learning (ML) because algorithmic decisions increasingly affect people. Counterfactual explanations can help users understand not only why an ML model makes a certain decision, but also how that decision can be changed. We frame the problem of finding counterfactual explanations as an optimization task and extend previous work that could only be applied to differentiable models. To accommodate non-differentiable models such as tree ensembles, we propose using probabilistic model approximations in the optimization framework. We introduce a novel approximation technique that is effective for finding counterfactual explanations for predictions of the original model, and we show that our counterfactual examples are significantly closer to the original instances than those produced by other methods specifically designed for tree ensembles.
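As an illustration only, the following sketch shows one way the idea of optimizing over a smooth approximation of a tree ensemble could look. The abstract does not specify the approximation, so everything here is an assumption: the toy ensemble `STUMPS`, the sigmoid relaxation of each split with sharpness `sigma`, the squared-error prediction loss, the squared Euclidean distance term weighted by `beta`, and the helper names are all hypothetical and not taken from the paper.

```python
import math

# Hypothetical toy ensemble of decision stumps (an assumption for illustration).
# Each stump is (feature index, threshold, left leaf value, right leaf value).
STUMPS = [(0, 2.0, 0.0, 1.0), (1, 1.0, 0.0, 1.0)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hard_score(x):
    # Original, non-differentiable ensemble: average of the selected leaf values.
    return sum(r if x[f] > t else l for f, t, l, r in STUMPS) / len(STUMPS)


def hard_predict(x):
    # Class 1 if the averaged score reaches 0.5, else class 0.
    return int(hard_score(x) >= 0.5)


def soft_score_and_grad(x, sigma):
    # Assumed relaxation: replace each hard split 1[x_f > t] with
    # sigmoid(sigma * (x_f - t)), which makes the score differentiable in x.
    score = 0.0
    grad = [0.0] * len(x)
    for f, t, l, r in STUMPS:
        s = sigmoid(sigma * (x[f] - t))
        score += l + (r - l) * s
        grad[f] += (r - l) * sigma * s * (1.0 - s)
    n = len(STUMPS)
    return score / n, [g / n for g in grad]


def find_counterfactual(x, sigma=5.0, beta=0.01, lr=0.1, max_steps=500):
    # Gradient descent on (1 - soft_score)^2 + beta * ||cf - x||^2,
    # stopping as soon as the ORIGINAL (hard) ensemble predicts class 1.
    cf = list(x)
    for _ in range(max_steps):
        if hard_predict(cf) == 1:
            return cf
        score, grad = soft_score_and_grad(cf, sigma)
        for j in range(len(cf)):
            pred_grad = -2.0 * (1.0 - score) * grad[j]
            dist_grad = 2.0 * beta * (cf[j] - x[j])
            cf[j] -= lr * (pred_grad + dist_grad)
    return None  # no counterfactual found within the step budget


x = [1.0, 0.5]               # original instance, predicted as class 0
cf = find_counterfactual(x)  # nearby instance predicted as class 1, or None
```

In this sketch the search is driven by the smooth surrogate, but validity is always checked against the original hard ensemble, which mirrors the abstract's goal of explaining predictions of the original model. The distance term is what pulls the counterfactual toward the original instance; a different distance function or loss could be substituted.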
