The Bias and Efficiency of Incomplete-Data Estimators in Small Univariate Normal Samples

Abstract

Recent simulations have shown that widely used methods for analyzing missing data can be biased in small samples, even when the underlying statistical model is correctly specified. In an effort to understand these biases, this paper analyzes in detail the situation where a small univariate normal sample is missing values at random. Estimates are derived using either observed-data maximum likelihood (ML) or multiple imputation (MI). We distinguish two types of MI: the usual Bayesian approach, which we call posterior draw (PD) imputation, and a little-used alternative, which we call ML imputation, in which values are imputed conditionally on an ML estimate. We find that PD imputation has a large bias and low efficiency when the usual prior is used; however, modifying the prior can substantially improve both bias and efficiency. ML imputation dominates PD imputation, with greater efficiency and less potential for bias. Observed-data ML dominates both ML imputation and PD imputation.
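The two imputation schemes contrasted above can be illustrated with a small Monte Carlo sketch. This is not the paper's actual simulation design: the settings (n = 10 with 4 values missing completely at random, m = 20 imputations, standard normal data) and all function names are our own illustrative assumptions. PD imputation draws (μ, σ²) from the posterior under the usual noninformative prior p(μ, σ²) ∝ 1/σ² before imputing; ML imputation plugs in the observed-data ML estimate directly.

```python
import numpy as np

rng = np.random.default_rng(0)

def ml_estimates(y):
    # Observed-data ML for a univariate normal sample:
    # sample mean and variance with divisor n (not n - 1).
    return y.mean(), y.var()

def impute_ml(y_obs, n_mis, rng):
    # "ML imputation": impute conditionally on the ML estimate.
    mu, s2 = ml_estimates(y_obs)
    return rng.normal(mu, np.sqrt(s2), n_mis)

def impute_pd(y_obs, n_mis, rng):
    # "PD imputation": draw (mu, sigma^2) from the posterior under
    # the usual prior p(mu, sigma^2) ∝ 1/sigma^2, then impute.
    n = len(y_obs)
    ybar, s2 = y_obs.mean(), y_obs.var(ddof=1)
    sigma2 = (n - 1) * s2 / rng.chisquare(n - 1)   # scaled inv-chi-square
    mu = rng.normal(ybar, np.sqrt(sigma2 / n))
    return rng.normal(mu, np.sqrt(sigma2), n_mis)

def mi_mean(y_obs, n_mis, imputer, m, rng):
    # Rubin's MI point estimate: average of the m completed-data means.
    ests = []
    for _ in range(m):
        y_full = np.concatenate([y_obs, imputer(y_obs, n_mis, rng)])
        ests.append(y_full.mean())
    return np.mean(ests)

# Monte Carlo comparison (illustrative numbers, not from the paper).
n, n_mis, m, reps = 10, 4, 20, 2000
true_mu = 0.0
err_ml, err_pd = [], []
for _ in range(reps):
    y_obs = rng.normal(true_mu, 1.0, n - n_mis)  # MCAR: 4 of 10 missing
    err_ml.append(mi_mean(y_obs, n_mis, impute_ml, m, rng) - true_mu)
    err_pd.append(mi_mean(y_obs, n_mis, impute_pd, m, rng) - true_mu)

print("ML-imputation RMSE:", np.sqrt(np.mean(np.square(err_ml))))
print("PD-imputation RMSE:", np.sqrt(np.mean(np.square(err_pd))))
```

For the mean under MCAR both schemes are unbiased, so the RMSE comparison here reflects only the efficiency loss from the extra posterior-draw noise in PD imputation; the bias effects the paper analyzes show up in variance estimation and in smaller samples.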
