Model Selection for Treatment Choice: Penalized Welfare Maximization
This paper studies a new statistical decision rule for the treatment assignment problem. Consider a utilitarian policy maker who must use sample data to allocate one of two treatments to members of a population, based on their observable characteristics. In practice, policy makers often lack full discretion over how these covariates can be used, for legal, ethical, or political reasons. We treat this constrained problem as a statistical decision problem, evaluating the performance of decision rules by their maximum regret. We focus on settings in which the policy maker may want to select among a collection of such constrained classes: examples we consider include choosing the number of covariates over which to perform best-subset selection, and model selection when approximating a complicated class via a sieve. We adapt and extend results from statistical learning to develop a decision rule which we call the Penalized Welfare Maximization (PWM) rule. We establish an oracle inequality for the regret of the PWM rule which shows that it is able to perform model selection over the collection of available classes. We then use this oracle inequality to derive relevant bounds on maximum regret for PWM. We illustrate the model-selection capabilities of our method with a small simulation exercise, and conclude by applying our rule to data from the Job Training Partnership Act (JTPA) study.
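The core idea of a penalized welfare maximization rule can be sketched as follows: within each candidate policy class, maximize an empirical (e.g. inverse-propensity-weighted) welfare estimate, then choose across classes by subtracting a complexity penalty. The sketch below is illustrative only, with simulated data, ad hoc policy classes, and placeholder penalty constants; it is not the paper's exact estimator or penalty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated randomized experiment: one covariate x, known propensity p,
# binary treatment d, observed outcome y. All names are illustrative.
n, p = 500, 0.5
x = rng.uniform(-1, 1, n)
d = rng.binomial(1, p, n)                     # treatment indicator
tau = np.where(x > 0.2, 1.0, -0.5)            # heterogeneous treatment effect
y = x + d * tau + rng.normal(0, 1, n)

def empirical_welfare(policy):
    """IPW estimate of mean outcome when 'policy' (0/1 vector) is followed."""
    w = d / p * y * policy + (1 - d) / (1 - p) * y * (1 - policy)
    return w.mean()

# Two nested candidate classes: threshold rules, and threshold + interval rules.
thresholds = np.linspace(-1, 1, 41)
class_small = [(x > t).astype(int) for t in thresholds]
class_large = class_small + [((x > a) & (x < b)).astype(int)
                             for a in thresholds for b in thresholds if a < b]

# Penalize each class by a complexity term shrinking like 1/sqrt(n);
# the constants (1.0, 2.0) are placeholders, not the paper's penalty.
best = None
for policies, complexity in [(class_small, 1.0), (class_large, 2.0)]:
    welfares = [empirical_welfare(pi) for pi in policies]
    penalized = max(welfares) - complexity / np.sqrt(n)
    if best is None or penalized > best[0]:
        best = (penalized, policies[int(np.argmax(welfares))])

print("penalized welfare of selected rule: %.3f" % best[0])
```

The richer class is charged a larger penalty, so it is selected only when its welfare gain exceeds the extra complexity cost, which is the model-selection behavior the oracle inequality formalizes.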