On the Convergence of the EM Algorithm: A Data-Adaptive Analysis

The Expectation-Maximization (EM) algorithm is an iterative method to maximize the log-likelihood function for parameter estimation. Previous works on the convergence analysis of the EM algorithm have established results on the asymptotic (population level) convergence rate of the algorithm. In this paper, we give a data-adaptive analysis of the sample level local convergence rate of the EM algorithm. In particular, we show that the local convergence rate of the EM algorithm is a random variable derived from the data generating distribution, which adaptively yields the convergence rate of the EM algorithm on each finite sample data set from the same population distribution. We then give a non-asymptotic concentration bound of on the population level optimal convergence rate of the EM algorithm, which implies that in probability as the sample size . Our theory identifies the effect of sample size on the convergence behavior of sample EM sequence, and explains a surprising phenomenon in applications of the EM algorithm, i.e. the finite sample version of the algorithm sometimes converges faster even than the population version. We apply our theory to the EM algorithm on three canonical models and obtain specific forms of the adaptive convergence theorem for each model.
View on arXiv