Eigenvector overlaps in large sample covariance matrices and nonlinear shrinkage estimators

28 April 2024

Abstract

Consider a data matrix $Y = [\mathbf{y}_1, \cdots, \mathbf{y}_N]$ of size $M \times N$ , where the columns are independent observations from a random vector $\mathbf{y}$ with zero mean and population covariance $\Sigma$ . Let $\mathbf{u}_i$ and $\mathbf{v}_j$ denote the left and right singular vectors of $Y$ , respectively. This study investigates the eigenvector/singular vector overlaps $\langle {\mathbf{u}_i, D_1 \mathbf{u}_j} \rangle$ , $\langle {\mathbf{v}_i, D_2 \mathbf{v}_j} \rangle$ and $\langle {\mathbf{u}_i, D_3 \mathbf{v}_j} \rangle$ , where $D_k$ are general deterministic matrices with bounded operator norms. We establish the convergence in probability of these eigenvector overlaps toward their deterministic counterparts with explicit convergence rates, when the dimension $M$ scales proportionally with the sample size $N$ . Building on these findings, we offer a more precise characterization of the loss for Ledoit and Wolf's nonlinear shrinkage estimators of the population covariance $\Sigma$ .

View on arXiv

Comments on this paper