Experimental Evaluation of Individualized Treatment Rules

In recent years, the increasing availability of individual-level data has led to the rapid methodological development of individualized (or personalized) treatment rules (ITRs). These new tools are being deployed in a variety of fields, including business, medicine, and politics. We propose to use a randomized experiment to evaluate the empirical performance of ITRs and to quantify the uncertainty of this evaluation under Neyman's repeated sampling framework. Unlike existing methods, the proposed experimental evaluation requires no modeling assumptions, asymptotic approximations, or resampling methods. As a result, it is applicable to any ITR, including those based on complex machine learning algorithms. Our methodology also takes into account a budget constraint, an important consideration for policymakers with limited resources. Furthermore, we extend our theoretical results to the common situation in which ITRs are estimated via cross-validation using the same experimental data as that used for their evaluation. We show how to account for the additional uncertainty arising from the estimation of ITRs. Finally, we conduct a simulation study to demonstrate the accuracy of the proposed methodology in small samples. We also apply our methods to the Project STAR (Student-Teacher Achievement Ratio) experiment and compare the performance of ITRs based on several machine learning methods that are widely used for estimating heterogeneous treatment effects.
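To make the evaluation idea concrete, below is a minimal sketch of how the average value of a fixed ITR might be estimated from a completely randomized experiment, in the spirit of Neyman's repeated sampling framework. The function name `evaluate_itr`, the 50% assignment probability in the toy example, and the conservative variance formula are illustrative assumptions, not the paper's exact estimators.

```python
import numpy as np

def evaluate_itr(Y, T, rec):
    """Sketch: estimate the average outcome a fixed ITR would induce,
    using data from a completely randomized binary-treatment experiment.

    Y   : observed outcomes
    T   : randomized binary treatment assignments (0/1)
    rec : binary recommendations of the ITR, f(X_i)

    Returns a point estimate and a rough Neyman-style standard error.
    """
    Y, T, rec = map(np.asarray, (Y, T, rec))
    n1, n0 = T.sum(), (1 - T).sum()
    # Units whose random assignment coincides with the rule's
    # recommendation reveal the outcome the rule would have produced.
    treated_term = (T * rec * Y).sum() / n1
    control_term = ((1 - T) * (1 - rec) * Y).sum() / n0
    value = treated_term + control_term
    # Conservative variance from within-arm sample variances
    # (an illustrative assumption, not the paper's formula).
    v1 = np.var(rec[T == 1] * Y[T == 1], ddof=1) / n1
    v0 = np.var((1 - rec[T == 0]) * Y[T == 0], ddof=1) / n0
    return value, np.sqrt(v1 + v0)

# Toy usage with simulated data (hypothetical ITR: treat when X > 0).
rng = np.random.default_rng(0)
X = rng.normal(size=500)
T = rng.binomial(1, 0.5, size=500)
Y = X * T + rng.normal(size=500)  # treatment helps only when X > 0
rec = (X > 0).astype(int)
print(evaluate_itr(Y, T, rec))
```

The key design point, consistent with the abstract's claim of requiring no modeling assumptions, is that randomization alone identifies the rule's value: units whose assignment happens to match the recommendation serve as an unbiased sample of the outcomes the deployed rule would generate.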