
Spectral statistics of large dimensional Spearman's rank correlation matrix and its application

Abstract

Let $\mathbf{Q}=(Q_1,\ldots,Q_n)$ be a random vector drawn from the uniform distribution on the set of all $n!$ permutations of $\{1,2,\ldots,n\}$. Let $\mathbf{Z}=(Z_1,\ldots,Z_n)$, where $Z_j$ is the mean-zero, variance-one random variable obtained by centering and normalizing $Q_j$, $j=1,\ldots,n$. Assume that $\mathbf{X}_i$, $i=1,\ldots,p$, are i.i.d. copies of $\frac{1}{\sqrt{p}}\mathbf{Z}$ and that $X=X_{p,n}$ is the $p\times n$ random matrix with $\mathbf{X}_i$ as its $i$th row. Then $S_n=XX^*$ is called the $p\times n$ Spearman's rank correlation matrix, which can be regarded as a high dimensional extension of the classical nonparametric statistic, Spearman's rank correlation coefficient between two independent random variables. In this paper, we establish a CLT for the linear spectral statistics of this nonparametric random matrix model in the high dimensional scenario, namely, $p=p(n)$ and $p/n\to c\in(0,\infty)$ as $n\to\infty$. We propose a novel evaluation scheme to estimate the core quantity in Anderson and Zeitouni's cumulant method in [Ann. Statist. 36 (2008) 2553-2576], which bypasses the so-called joint cumulant summability. In addition, we develop a two-step comparison approach to obtain explicit formulae for the mean and covariance functions in the CLT. Relying on this CLT, we then construct a distribution-free statistic to test complete independence of the components of random vectors. Owing to its nonparametric nature, this test can be applied to generally distributed random variables, including heavy-tailed ones.
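The construction of $S_n$ described above can be sketched in a few lines of Python. This is only an illustrative simulation of the null model (independent uniform permutations in each row), not the paper's test procedure. The function names `spearman_matrix`, `trace_stat` and `trace_sq_stat` are hypothetical, and the two statistics shown are simply the linear spectral statistics $\sum_i f(\lambda_i)$ for $f(x)=x$ and $f(x)=x^2$, chosen because they can be computed from the matrix entries without an eigenvalue solver.

```python
import math
import random


def spearman_matrix(p, n, rng):
    """Build S_n = X X^* where each row of X is Z / sqrt(p) and Z is a
    centered, normalized uniform random permutation of 1..n."""
    mean = (n + 1) / 2                    # E[Q_j] for a uniform permutation
    std = math.sqrt((n * n - 1) / 12)     # sqrt(Var[Q_j])
    scale = std * math.sqrt(p)
    rows = []
    for _ in range(p):
        perm = rng.sample(range(1, n + 1), n)   # uniform random permutation
        rows.append([(q - mean) / scale for q in perm])
    # p x p Gram matrix of the rows
    return [[sum(a * b for a, b in zip(ri, rk)) for rk in rows] for ri in rows]


def trace_stat(S):
    """Linear spectral statistic for f(x) = x, i.e. tr(S)."""
    return sum(S[i][i] for i in range(len(S)))


def trace_sq_stat(S):
    """Linear spectral statistic for f(x) = x^2, i.e. tr(S^2).
    For a symmetric matrix this equals the sum of squared entries."""
    return sum(v * v for row in S for v in row)


if __name__ == "__main__":
    rng = random.Random(0)    # fixed seed, only for reproducibility
    p, n = 20, 40             # illustrative sizes with p/n = 0.5
    S = spearman_matrix(p, n, rng)
    print("tr(S)   =", trace_stat(S))
    print("tr(S^2) =", trace_sq_stat(S))
```

With this normalization, each row of $X$ has squared norm $n/p$, so every diagonal entry of $S_n$ equals $n/p$ and $\operatorname{tr}(S_n)=n$ deterministically. The randomness relevant to the CLT therefore only shows up in statistics such as $\operatorname{tr}(S_n^2)$.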
