Optimal Experimental Design for Staggered Rollouts

9 November 2019

Abstract

In this paper, we study the problem of designing experiments that are conducted on a set of units such as users or groups of users in an online marketplace, for multiple time periods such as weeks or months. These experiments are particularly useful to study the treatments that have causal effects on both current and future outcomes (instantaneous and lagged effects). The design problem involves selecting a treatment time for each unit, before or during the experiment, in order to most precisely estimate the instantaneous and lagged effects, post experimentation. This optimization of the treatment decisions can directly minimize the opportunity cost of the experiment by reducing its sample size requirement. The optimization is an NP-hard integer program for which we provide a near-optimal solution, when the design decisions are performed all at the beginning (fixed-sample-size designs). Next, we study sequential experiments that allow adaptive decisions during the experiments, and also potentially early stop the experiments, further reducing their cost. However, the sequential nature of these experiments complicates both the design phase and the estimation phase. We propose a new algorithm, PGAE, that addresses these challenges by adaptively making treatment decisions, estimating the treatment effects, and drawing valid post-experimentation inference. PGAE combines ideas from Bayesian statistics, dynamic programming, and sample splitting. Using synthetic experiments on real data sets from multiple domains, we demonstrate that our proposed solutions for fixed-sample-size and sequential experiments reduce the opportunity cost of the experiments by over 50% and 70%, respectively, compared to benchmarks.

View on arXiv

Comments on this paper