Gaussian process (GP) models are effective non-linear models for numerous scientific applications. However, estimating their hyperparameters can be difficult when the number of training observations (n) is large, because each evaluation of the likelihood function costs O(n^3). Furthermore, non-identifiable hyperparameter values can make parameter estimation more difficult still. For these reasons, maximum likelihood estimation or Bayesian calibration is sometimes omitted and the hyperparameters are instead chosen by prediction-based methods such as a grid search with cross-validation. Kriging, that is, prediction with a Gaussian process model, amounts to a weighted mean of the data, in which training observations close to the prediction location, as determined by the form and hyperparameters of the kernel, receive larger weights. Our analysis focuses on the commonly used Matérn covariance function, of which the radial basis function (RBF) kernel is the limit as the smoothness parameter tends to infinity. We first perform a collinearity analysis to motivate the identifiability issues among the parameters of the Matérn covariance function, and we demonstrate which of its parameters can be estimated from predictions alone. Treating the kriging weights for fixed training data and a fixed prediction location as a function of the hyperparameters, we evaluate their sensitivities, as well as that of the predicted variance, with respect to those hyperparameters. We show that the smoothness parameter nu is the most influential in determining the kriging weights, particularly when the nugget parameter is small, indicating it is the most important parameter to estimate. Finally, we demonstrate the impact of these conclusions on accuracy and performance in a classification problem using a latent Gaussian process model with hyperparameters selected by grid search.
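
The following is a minimal sketch, not taken from the paper, of the quantities the abstract discusses: the kriging weights and predictive variance at a single prediction location under a Matérn kernel, evaluated for several values of the smoothness parameter nu (with nu = infinity recovering the RBF kernel). It assumes scikit-learn's Matern kernel class; the data, length scale, and nugget value are illustrative only.

```python
import numpy as np
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 1))   # training inputs (illustrative)
x_star = np.array([[0.5]])            # fixed prediction location
nugget = 1e-6                         # small nugget (noise) parameter

for nu in [0.5, 1.5, 2.5, np.inf]:    # nu = inf corresponds to the RBF kernel
    kernel = Matern(length_scale=0.2, nu=nu)
    K = kernel(X) + nugget * np.eye(len(X))    # kernel matrix plus nugget
    k_star = kernel(X, x_star)                 # covariances with x_star
    weights = np.linalg.solve(K, k_star)       # kriging weights
    var = kernel(x_star) - k_star.T @ weights  # predictive variance at x_star
    print(f"nu={nu}: max |weight|={np.abs(weights).max():.3f}, "
          f"pred. var={var.item():.4f}")
```

Comparing the printed weights and variances across nu (especially with a small nugget) gives a direct, if informal, sense of the sensitivity to the smoothness parameter that the paper analyzes.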