Bootstrapping the Operator Norm in High Dimensions: Error Estimation for Covariance Matrices and Sketching

Abstract

Although the operator (spectral) norm is one of the most widely used metrics for covariance estimation, comparatively little is known about the fluctuations of error in this norm. To be specific, let $\hat\Sigma$ denote the sample covariance matrix of $n$ observations in $\mathbb{R}^p$ that arise from a population matrix $\Sigma$, and let $T_n=\sqrt{n}\|\hat\Sigma-\Sigma\|_{\text{op}}$. In the setting where the eigenvalues of $\Sigma$ have a decay profile of the form $\lambda_j(\Sigma)\asymp j^{-2\beta}$, we analyze how well the bootstrap can approximate the distribution of $T_n$. Our main result shows that up to factors of $\log(n)$, the bootstrap can approximate the distribution of $T_n$ at the dimension-free rate of $n^{-\frac{\beta-1/2}{6\beta+4}}$, with respect to the Kolmogorov metric. Perhaps surprisingly, a result of this type appears to be new even in settings where $p<n$. More generally, we discuss the consequences of this result beyond covariance matrices and show how the bootstrap can be used to estimate the errors of sketching algorithms in randomized numerical linear algebra (RandNLA). An illustration of these ideas is also provided with a climate data example.
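
To make the bootstrapped quantity concrete, the following is a minimal sketch of a standard nonparametric (row-resampling) bootstrap for the operator-norm error. It is an illustration under stated assumptions, not necessarily the exact procedure analyzed in the paper: the function name `bootstrap_operator_norm`, the parameters `num_boot` and `seed`, and the choice to center the data and divide by $n$ are all hypothetical choices made here. Each bootstrap replicate computes $\sqrt{n}\|\hat\Sigma^*-\hat\Sigma\|_{\text{op}}$, where $\hat\Sigma^*$ is the sample covariance of rows resampled with replacement.

```python
import numpy as np


def bootstrap_operator_norm(X, num_boot=200, seed=0):
    """Bootstrap replicates of sqrt(n) * ||Sigma_star - Sigma_hat||_op.

    X is an (n, p) data matrix with one observation per row.
    Assumption (not from the source): data are centered by the sample
    mean and covariances are normalized by n.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape

    # Sample covariance of the original data.
    Xc = X - X.mean(axis=0)
    sigma_hat = Xc.T @ Xc / n

    stats = np.empty(num_boot)
    for b in range(num_boot):
        # Resample n rows with replacement.
        idx = rng.integers(0, n, size=n)
        Xb = X[idx]
        Xb = Xb - Xb.mean(axis=0)
        sigma_star = Xb.T @ Xb / n

        # ord=2 on a 2-D array gives the largest singular value,
        # i.e. the operator (spectral) norm.
        stats[b] = np.sqrt(n) * np.linalg.norm(sigma_star - sigma_hat, ord=2)
    return stats
```

Under these assumptions, an empirical quantile such as `np.quantile(stats, 0.95)` serves as an estimate of the corresponding quantile of $T_n$, and dividing it by $\sqrt{n}$ gives an approximate high-probability bound on $\|\hat\Sigma-\Sigma\|_{\text{op}}$.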
