
Sparse Precision Matrix Selection for Fitting Gaussian Random Field Models to Large Data Sets

Abstract

Fitting a Gaussian Random Field (GRF) model to spatial data via maximum likelihood (ML) requires optimizing a highly non-convex function. Iterative methods for solving the ML problem require $O(n^3)$ floating point operations per iteration, where $n$ denotes the number of data points observed, because an $n \times n$ covariance matrix must be inverted at each iteration. Therefore, for large data sets, the non-convexity of the ML problem together with its $O(n^3)$ complexity renders the traditional ML methodology very inefficient for GRF model fitting. In this paper, we propose a new two-step GRF estimation procedure. The first step solves a \emph{convex} distance-based regularized likelihood problem to fit a sparse \emph{precision} (inverse covariance) matrix to the GRF model. This implies a Gaussian Markov Random Field (GMRF) approximation to the GRF, although we do not explicitly fit a GMRF. The Alternating Direction Method of Multipliers (ADMM) algorithm is used in this first stage. In the second step, we estimate the parameters of the GRF spatial covariance function by finding the covariance matrix that is closest in Frobenius norm to the inverse of the precision matrix obtained in stage one. We show how this second-stage problem can be solved using a simple line search if an isotropic stationary covariance function is assumed. Numerical experiments on both synthetic and real data sets show improvements of one order of magnitude in mean square prediction error over competitor methods for a variety of spatial covariance models. Furthermore, the covariance parameters estimated by our two-stage method are shown to be more precise and accurate than those obtained with alternative methods. The proposed approach can easily be parallelized to fit a GRF model to large data sets and can also be easily modified to allow for anisotropic covariance functions.
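
To make the second stage concrete, the following is a minimal sketch of a Frobenius-norm fit by one-dimensional search. It is an illustration under stated assumptions, not the paper's implementation. It assumes an isotropic exponential covariance $\nu \exp(-d/\rho)$, and it uses a golden-section search as one possible "simple line search". The matrix `target` stands in for the inverse of the stage-one precision matrix. All function and variable names (`exp_cov`, `best_variance`, `frob_misfit`, `fit_range`) are hypothetical.

```python
import math


def exp_cov(dist, rho, nu):
    """Assumed isotropic exponential covariance: nu * exp(-d / rho)."""
    return [[nu * math.exp(-d / rho) for d in row] for row in dist]


def best_variance(dist, target, rho):
    """Closed-form least-squares variance nu for a fixed range rho."""
    num = den = 0.0
    for drow, trow in zip(dist, target):
        for d, t in zip(drow, trow):
            r = math.exp(-d / rho)
            num += r * t
            den += r * r
    return max(num / den, 0.0)  # a variance cannot be negative


def frob_misfit(dist, target, rho):
    """Squared Frobenius distance to target, with nu profiled out."""
    nu = best_variance(dist, target, rho)
    return sum((nu * math.exp(-d / rho) - t) ** 2
               for drow, trow in zip(dist, target)
               for d, t in zip(drow, trow))


def fit_range(dist, target, lo=1e-3, hi=10.0, tol=1e-6):
    """Golden-section search over rho on the bracket [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - g * (b - a)
    d = a + g * (b - a)
    fc = frob_misfit(dist, target, c)
    fd = frob_misfit(dist, target, d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = frob_misfit(dist, target, c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = frob_misfit(dist, target, d)
    rho = 0.5 * (a + b)
    return rho, best_variance(dist, target, rho)


# Toy usage: four locations on a line and a noise-free target matrix
# built from known parameters (rho = 1.5, nu = 2.0).
coords = [0.0, 1.0, 2.0, 3.5]
dist = [[abs(x - y) for y in coords] for x in coords]
target = exp_cov(dist, rho=1.5, nu=2.0)
rho_hat, nu_hat = fit_range(dist, target)
```

Profiling out the variance in closed form leaves a single scalar unknown, which is why a one-dimensional search suffices in this sketch. The bracket `[lo, hi]` and the tolerance are arbitrary choices for the toy example and would need to be set from the spatial scale of real data.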
