Detecting Rare and Weak Spikes in Large Covariance Matrices

Abstract

Given $p$-dimensional Gaussian vectors $X_i \stackrel{iid}{\sim} N(0, \Sigma)$, $1 \leq i \leq n$, where $p \geq n$, we are interested in testing a null hypothesis where $\Sigma = I_p$ against an alternative hypothesis where all eigenvalues of $\Sigma$ are $1$, except for $r$ of them, which are larger than $1$ (i.e., spiked eigenvalues). We consider a Rare/Weak setting where the spikes are sparse (i.e., $1 \ll r \ll p$) and individually weak (i.e., each spiked eigenvalue is only slightly larger than $1$), and discover a phase transition: the two-dimensional phase space that calibrates the spike sparsity and strengths partitions into the Region of Impossibility and the Region of Possibility. In the Region of Impossibility, all tests are (asymptotically) powerless in separating the alternative from the null. In the Region of Possibility, there are tests that have (asymptotically) full power. We consider a CuSum test, a trace-based test, an eigenvalue-based Higher Criticism test, and a Tracy-Widom test (Johnstone 2001), and show that the first two tests have asymptotically full power in the Region of Possibility. Viewing our results from a different angle, we derive new bounds for (a) empirical eigenvalues, and (b) cumulative sums of the empirical eigenvalues, both under the alternative hypothesis. Part (a) is related to the results in Baik, Ben-Arous and Peche (2005), but both the settings and results are different. The study requires careful analysis of the $L^1$-distance of our testing problem and delicate Random Matrix Theory. Our technical devices include (a) a Gaussian proxy model, (b) Le Cam's comparison of experiments, and (c) large deviation bounds on empirical eigenvalues.
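
To make the testing problem concrete, the following is a minimal simulation sketch in Python. It is not the paper's procedure: the abstract does not define its trace-based test, so the statistic below is an assumed, textbook-style standardization of the trace of the sample covariance matrix. It uses the fact that under the null n * tr(S) follows a chi-square distribution with n*p degrees of freedom, so tr(S) has mean p and variance 2p/n. The function names, the diagonal form of the covariance, the common spike size `h`, and all parameter values are illustrative assumptions.

```python
import numpy as np


def sample_spiked(n, p, r, h, rng):
    """Draw n i.i.d. rows from N(0, Sigma) with r spiked eigenvalues.

    Assumption for illustration: Sigma is diagonal with the first r
    eigenvalues equal to 1 + h and the remaining p - r equal to 1.
    Setting r = 0 gives the null hypothesis Sigma = I_p.
    """
    X = rng.standard_normal((n, p))
    X[:, :r] *= np.sqrt(1.0 + h)  # inflate the variance of r coordinates
    return X


def trace_statistic(X):
    """Standardized trace of the sample covariance S = X^T X / n.

    Under Sigma = I_p, tr(S) has mean p and variance 2p/n, so the
    returned value is approximately N(0, 1) under the null.
    """
    n, p = X.shape
    trace_S = np.sum(X ** 2) / n  # equals tr(X^T X) / n
    return (trace_S - p) / np.sqrt(2.0 * p / n)


rng = np.random.default_rng(0)
n, p, r, h = 400, 400, 40, 0.5  # illustrative values with p >= n

t_null = trace_statistic(sample_spiked(n, p, 0, 0.0, rng))
t_alt = trace_statistic(sample_spiked(n, p, r, h, rng))
print(f"null: {t_null:.2f}, alternative: {t_alt:.2f}")
```

In this sketch the spikes shift the mean of the statistic by r * h / sqrt(2p/n), which is about 14 for the values above, so the alternative is easy to separate from the null. The regime studied in the abstract is the harder one where that separation becomes delicate as the spikes get sparser and weaker.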
