Efficiency of Regression (Un)-Adjusted Rosenbaum's Rank-based Estimator in Randomized Experiments

A completely randomized experiment allows us to estimate the causal effect by the difference in the averages of the outcome under treatment and control. However, difference-in-means-type estimators behave poorly when the potential outcomes are heavy-tailed or contain a few outliers. We study an alternative estimator, due to Rosenbaum, that estimates the causal effect by inverting a rank-based randomization test. By calculating the asymptotic breakdown point of this estimator, we show that it is provably more robust than the difference-in-means estimator. We obtain the limiting distribution of this estimator and develop a framework for comparing the efficiencies of different estimators of the treatment effect in the setting of randomized experiments. In particular, we show that the asymptotic variance of Rosenbaum's estimator is, in the worst case, about 1.16 times the variance of the difference-in-means estimator, and can be much smaller when the potential outcomes are not light-tailed. Further, we propose a regression-adjusted version of Rosenbaum's estimator to incorporate additional covariate information in randomization inference. We prove that this regression adjustment improves efficiency under a linear regression model. Finally, we illustrate through synthetic and real data that these rank-based estimators, regression adjusted or unadjusted, are efficient and robust against heavy-tailed distributions, contamination, and model misspecification.
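To make the idea concrete, here is a minimal sketch (not the authors' code) of the kind of rank-based inversion described above, assuming a constant additive treatment effect and a Wilcoxon rank-sum test statistic: the estimate is the shift tau at which the rank-sum statistic of the tau-adjusted treated outcomes equals its null expectation, which for this statistic reduces to the Hodges-Lehmann estimate, the median of all pairwise treated-minus-control differences. The heavy-tailed simulation at the end is purely illustrative.

```python
# A sketch of a rank-based (Hodges-Lehmann type) estimator obtained by inverting a
# Wilcoxon rank-sum test, compared with the difference-in-means estimator.
import numpy as np
from scipy.stats import rankdata


def diff_in_means(y_treat, y_control):
    """Standard difference-in-means estimate of the treatment effect."""
    return y_treat.mean() - y_control.mean()


def rank_sum_stat(y_treat, y_control, tau):
    """Wilcoxon rank-sum statistic of the treated outcomes shifted by -tau."""
    pooled = np.concatenate([y_treat - tau, y_control])
    return rankdata(pooled)[: len(y_treat)].sum()


def rank_based_estimate(y_treat, y_control):
    """Invert the rank test: the tau at which the statistic matches its null mean.

    For the Wilcoxon rank-sum statistic this has the closed form of the
    Hodges-Lehmann estimate: the median of all pairwise differences.
    """
    diffs = np.subtract.outer(y_treat, y_control).ravel()
    return np.median(diffs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tau_true = 2.0
    # Heavy-tailed potential outcomes (t distribution with 2 degrees of freedom).
    y0 = rng.standard_t(df=2, size=400)
    treated = rng.permutation(400) < 200  # completely randomized assignment
    y_treat, y_control = y0[treated] + tau_true, y0[~treated]

    tau_rank = rank_based_estimate(y_treat, y_control)
    print("difference in means:", diff_in_means(y_treat, y_control))
    print("rank-based estimate:", tau_rank)
    # Sanity check: at the rank-based estimate the statistic is near its null mean.
    m, n = len(y_treat), len(y_control)
    print("rank-sum at estimate:", rank_sum_stat(y_treat, y_control, tau_rank),
          "vs. null mean:", m * (m + n + 1) / 2)
```

In this illustration the rank-based estimate is typically much closer to the true shift than the difference in means, reflecting the robustness to heavy tails discussed in the abstract; the regression-adjusted variant proposed in the paper is not sketched here.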