Spectral statistics of large dimensional Spearman's rank correlation matrix and its application

Abstract

Let $\mathbf{Q}=(Q_1,\ldots,Q_n)$ be a random vector drawn from the uniform distribution on the set of all $n!$ permutations of $\{1,2,\ldots,n\}$. Let $\mathbf{Z}=(Z_1,\ldots,Z_n)$, where $Z_j$ is the mean-zero, variance-one random variable obtained by centering and normalizing $Q_j$, $j=1,\ldots,n$. Assume that $\mathbf{X}_i$, $i=1,\ldots,p$, are i.i.d. copies of $\frac{1}{\sqrt{p}}\mathbf{Z}$ and that $X=X_{p,n}$ is the $p\times n$ random matrix with $\mathbf{X}_i$ as its $i$-th row. Then $S_n=XX^*$ is called the $p\times n$ Spearman's rank correlation matrix, which can be regarded as a high-dimensional extension of the classical non-parametric statistic, Spearman's rank correlation coefficient between two independent random variables. In this paper we establish a CLT for the linear spectral statistics of this non-parametric random matrix model in the high-dimensional regime where $p=p(n)$ and $p/n\to c\in(0,\infty)$ as $n\to\infty$. We propose a novel evaluation scheme to estimate the core quantity in Anderson and Zeitouni's cumulant method in \cite{AZ2009}, which bypasses the so-called joint cumulant summability. In addition, we introduce a \emph{two-step comparison approach} to obtain explicit formulae for the mean and covariance functions in the CLT. Relying on this CLT, we then construct a distribution-free statistic to test complete independence of the components of random vectors. Owing to its non-parametric nature, this test can be applied to generally distributed random variables, including heavy-tailed ones.
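The construction in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: it assumes NumPy, and the function name `spearman_matrix`, the sizes `p=50`, `n=100`, and the test function $f(x)=x^2$ are illustrative choices. It uses the fact that a uniform draw from $\{1,\ldots,n\}$ has mean $(n+1)/2$ and variance $(n^2-1)/12$.

```python
import numpy as np


def spearman_matrix(p, n, rng):
    """Build S_n = X X^T from p independent uniform random permutations of 1..n.

    Hypothetical helper for illustration; follows the definition in the abstract.
    """
    # Each row Q_i is an independent uniform random permutation of {1, ..., n}.
    Q = np.stack([rng.permutation(n) + 1 for _ in range(p)]).astype(float)
    # Center and normalize: a uniform draw from {1..n} has mean (n+1)/2
    # and variance (n^2 - 1)/12.
    mean = (n + 1) / 2
    std = np.sqrt((n**2 - 1) / 12)
    Z = (Q - mean) / std
    # Rows of X are copies of Z / sqrt(p), as in the abstract.
    X = Z / np.sqrt(p)
    return X @ X.T


rng = np.random.default_rng(0)
p, n = 50, 100  # illustrative sizes, so p/n = 0.5
S = spearman_matrix(p, n, rng)

# Eigenvalues of the symmetric matrix S, then a linear spectral statistic
# sum_i f(lambda_i) with the illustrative choice f(x) = x^2.
eigs = np.linalg.eigvalsh(S)
lss = np.sum(eigs**2)
print(S.shape, lss)
```

Because every row is a permutation of the same values, each diagonal entry of $S_n$ equals $n/p$ exactly under this normalization, so only the off-diagonal entries (scaled rank correlations between rows) are random.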
