Detecting Rare and Weak Spikes in Large Covariance Matrices

4 September 2016

Abstract

Given $p$-dimensional Gaussian vectors $X_i \stackrel{iid}{\sim} N(0, \Sigma)$, $1 \leq i \leq n$, where $p \geq n$, we are interested in testing a null hypothesis where $\Sigma = I_p$ against an alternative hypothesis where all eigenvalues of $\Sigma$ are $1$, except for $r$ of them, which are larger than $1$ (i.e., spiked eigenvalues). We consider a Rare/Weak setting where the spikes are sparse (i.e., $1 \ll r \ll p$) and individually weak (i.e., each spiked eigenvalue is only slightly larger than $1$), and discover a phase transition: the two-dimensional phase space that calibrates the spike sparsity and strengths partitions into the Region of Impossibility and the Region of Possibility. In the Region of Impossibility, all tests are (asymptotically) powerless in separating the alternative from the null. In the Region of Possibility, there are tests that have (asymptotically) full power. We consider a CuSum test, a trace-based test, an eigenvalue-based Higher Criticism test, and a Tracy-Widom test (Johnstone 2001), and show that the first two tests have asymptotically full power in the Region of Possibility. Viewed from a different angle, our results yield new bounds for (a) empirical eigenvalues, and (b) cumulative sums of the empirical eigenvalues, both under the alternative hypothesis. Part (a) is related to the results in Baik, Ben-Arous and Peche (2005), but both the settings and results are different. The study requires careful analysis of the $L^1$-distance of our testing problem and delicate Random Matrix Theory. Our technical devices include (a) a Gaussian proxy model, (b) Le Cam's comparison of experiments, and (c) large deviation bounds on empirical eigenvalues.
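The abstract does not spell out the test statistics, so the following is only a rough illustration of the testing problem, not the authors' construction. It is a minimal sketch assuming a diagonal $\Sigma$ whose first $r$ eigenvalues share a common value `spike`, and a standardized trace statistic built from the fact that, under $\Sigma = I_p$, the trace of the sample covariance $\frac{1}{n}\sum_i X_i X_i^\top$ has mean $p$ and variance $2p/n$. The function names `sample_spiked` and `trace_statistic`, and all parameter values, are hypothetical choices made for this example.

```python
import numpy as np


def sample_spiked(n, p, r, spike, rng):
    """Draw n iid rows from N(0, Sigma), with Sigma diagonal.

    Assumption for illustration: the first r eigenvalues of Sigma equal
    `spike` and the remaining p - r equal 1. Setting r = 0 gives the null.
    """
    X = rng.standard_normal((n, p))
    X[:, :r] *= np.sqrt(spike)  # scale r coordinates to variance `spike`
    return X


def trace_statistic(X):
    """Standardized trace of the sample covariance (1/n) X^T X.

    Under Sigma = I_p, n * trace is chi-square with n * p degrees of
    freedom, so the trace has mean p and variance 2p / n.
    """
    n, p = X.shape
    trace = np.sum(X ** 2) / n
    return (trace - p) / np.sqrt(2.0 * p / n)


rng = np.random.default_rng(0)
n, p, r, spike = 200, 400, 40, 1.5  # p >= n, sparse and modest spikes

t_null = trace_statistic(sample_spiked(n, p, 0, 1.0, rng))
t_alt = trace_statistic(sample_spiked(n, p, r, spike, rng))
print(f"null: {t_null:.2f}  alternative: {t_alt:.2f}")
```

Under this toy alternative the trace shifts upward by $r(\text{spike} - 1)$ in expectation, so the statistic separates the two hypotheses only when that aggregate shift is large relative to $\sqrt{2p/n}$; this is an informal reading of why sparsity and strength jointly determine detectability.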
