Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning

12 June 2017

Sören R. Künzel

Abstract

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of meta-algorithms that can take advantage of any machine learning or regression method to estimate the conditional average treatment effect (CATE) function. Meta-algorithms build on base algorithms (such as OLS, the Nadaraya-Watson estimator, Random Forests (RF), Bayesian Additive Regression Trees (BART), or neural networks) to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a new meta-algorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other, and that can exploit structural properties of the CATE function. For example, if the CATE function is parametrically linear and the response functions in treatment and control are Lipschitz continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In our extensive simulation studies, the X-learner performs favorably, although none of the meta-learners is uniformly the best. We also analyze two real data applications, and provide a software package that implements our methods.
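The abstract does not spell out the steps of the X-learner, so the following is only a minimal sketch of the general idea, not the authors' software package. It assumes a binary treatment indicator `w`, a known propensity function, and plain least squares as the base learner. The names `fit_ols`, `x_learner`, and `propensity` are hypothetical and chosen for illustration; any regression method could be swapped in for `fit_ols`.

```python
import numpy as np


def fit_ols(X, y):
    """Hypothetical base learner: least squares with an intercept.

    Returns a function that predicts on new covariates.
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ beta


def x_learner(X, y, w, propensity, fit=fit_ols):
    """Sketch of an X-learner; returns a function estimating the CATE."""
    X0, y0 = X[w == 0], y[w == 0]  # control units
    X1, y1 = X[w == 1], y[w == 1]  # treated units

    # Stage 1: fit one response model per treatment arm.
    mu0 = fit(X0, y0)
    mu1 = fit(X1, y1)

    # Stage 2: impute unit-level effects using the other arm's model,
    # then regress the imputed effects on the covariates.
    d1 = y1 - mu0(X1)  # treated: observed outcome minus predicted control outcome
    d0 = mu1(X0) - y0  # control: predicted treated outcome minus observed outcome
    tau1 = fit(X1, d1)
    tau0 = fit(X0, d0)

    # Stage 3: combine the two estimates, weighting by the propensity score.
    def cate(Xn):
        g = propensity(Xn)
        return g * tau0(Xn) + (1 - g) * tau1(Xn)

    return cate


# Synthetic, unbalanced example (assumed data, not from the paper):
# roughly 20% of units are treated and the true CATE is linear.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
w = (rng.random(n) < 0.2).astype(int)
true_tau = 1.0 + 2.0 * X[:, 0]
y = X[:, 1] + w * true_tau + rng.normal(scale=0.1, size=n)

cate = x_learner(X, y, w, propensity=lambda Xn: np.full(len(Xn), 0.2))
estimate = cate(X)
max_err = float(np.max(np.abs(estimate - true_tau)))
```

In this sketch the weighting function is the propensity score, so the estimate leans on whichever second-stage model was built from the response function fitted on the larger group. That is one common choice, assumed here rather than taken from the abstract.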
