Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization

Matrix square roots and their inverses arise frequently in machine learning, e.g., when sampling from high-dimensional Gaussians or whitening a vector $\mathbf{b}$ against a covariance matrix $\mathbf{K}$. While existing methods typically require $\mathcal{O}(N^3)$ computation for an $N \times N$ matrix, we introduce a highly efficient quadratic-time algorithm for computing $\mathbf{K}^{1/2} \mathbf{b}$, $\mathbf{K}^{-1/2} \mathbf{b}$, and their derivatives through matrix-vector multiplications (MVMs). Our method combines Krylov subspace methods with a rational approximation and typically achieves several decimal places of accuracy with a modest number of MVMs. Moreover, the backward pass requires little additional computation. We demonstrate our method's applicability on very large matrices, well beyond the reach of traditional methods, with little approximation error. Applying this increased scalability to variational Gaussian processes, Bayesian optimization, and Gibbs sampling results in more powerful models with higher accuracy.
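To make the idea of computing a matrix square root times a vector purely through MVMs concrete, the following is a minimal sketch in Python with NumPy. It is not the algorithm of the paper, which pairs Krylov subspace methods with a rational approximation; it uses the simpler, classical Lanczos approximation $f(\mathbf{K})\mathbf{b} \approx \|\mathbf{b}\|\, \mathbf{Q} f(\mathbf{T}) \mathbf{e}_1$ instead. The function name `lanczos_sqrt_mvm`, its parameters, and the test matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np


def lanczos_sqrt_mvm(matvec, b, num_iters=40, inverse=False, tol=1e-10):
    """Approximate K^{1/2} b (or K^{-1/2} b) for symmetric positive definite K.

    Illustrative sketch only: K is accessed solely through `matvec(v) = K @ v`.
    This is plain Lanczos, not the rational-approximation method of the paper.
    """
    n = b.shape[0]
    b_norm = np.linalg.norm(b)
    Q = np.zeros((n, num_iters))  # orthonormal Krylov basis vectors
    alphas, betas = [], []        # diagonal / off-diagonal of tridiagonal T
    q = b / b_norm
    for j in range(num_iters):
        Q[:, j] = q
        w = matvec(q)             # the only place K is touched: one MVM
        alphas.append(q @ w)
        # Full reorthogonalization against all basis vectors built so far.
        basis = Q[:, : j + 1]
        w = w - basis @ (basis.T @ w)
        beta = np.linalg.norm(w)
        if j == num_iters - 1 or beta < tol:
            break                 # iteration budget spent or subspace exhausted
        betas.append(beta)
        q = w / beta

    m = len(alphas)
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    evals, evecs = np.linalg.eigh(T)
    evals = np.clip(evals, 1e-12, None)  # guard against round-off negatives
    power = -0.5 if inverse else 0.5
    # f(T) e_1, computed from the eigendecomposition of the small matrix T.
    f_T_e1 = evecs @ (evals**power * evecs[0, :])
    return b_norm * (Q[:, :m] @ f_T_e1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 200))
    K = A @ A.T / 200 + np.eye(200)  # hypothetical well-conditioned SPD matrix
    b = rng.standard_normal(200)

    # Dense reference solution, cubic in the matrix size.
    evals, evecs = np.linalg.eigh(K)
    exact = evecs @ (np.sqrt(evals) * (evecs.T @ b))

    approx = lanczos_sqrt_mvm(lambda v: K @ v, b)
    print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

Each Lanczos iteration costs one MVM, which is quadratic in $N$ for a dense matrix, so the total cost stays quadratic as long as the iteration count is small relative to $N$. The sketch stores the full Krylov basis and reorthogonalizes for numerical stability, a simplification that trades memory for robustness.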