On singular values of data matrices with general independent columns

Abstract

In this paper, we analyse the singular values of a large $p\times n$ data matrix $\mathbf{X}_n=(\mathbf{x}_{n1},\ldots,\mathbf{x}_{nn})$ whose columns $\mathbf{x}_{nj}$ are independent $p$-dimensional vectors, possibly with different distributions. Such data matrices are common in high-dimensional statistics. Under the key assumption that the covariance matrices $\mathbf{\Sigma}_{nj}=\text{Cov}(\mathbf{x}_{nj})$ are asymptotically simultaneously diagonalizable, together with appropriate convergence of their spectra, we establish a limiting distribution for the singular values of $\mathbf{X}_n$ when the dimension $p$ and the number of columns $n$ both grow to infinity at comparable rates. The matrix model includes and extends many existing models of sample covariance matrices, including the weighted sample covariance matrix, the Gram matrix model and the sample covariance matrix of linear time series models. Furthermore, we develop two applications of our general approach. First, we obtain the existence and uniqueness of a new limiting spectral distribution of realized covariance matrices for a multi-dimensional diffusion process with anisotropic time-varying co-volatility processes. Secondly, we derive the limiting spectral distribution for singular values of the data matrix for a recent matrix-valued auto-regressive model. Finally, for a generalized finite mixture model, the limiting spectral distribution for singular values of the data matrix is obtained.
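To make the setting concrete, the following is a minimal simulation sketch, not the paper's method. It draws a $p\times n$ matrix whose columns are independent with different covariance matrices that share one eigenbasis, so they are exactly (rather than asymptotically) simultaneously diagonalizable, and computes the singular values. The Gaussian entries, the uniform eigenvalue range, the $1/\sqrt{n}$ scaling and the function name `simulate_singular_values` are all illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def simulate_singular_values(p=200, n=400, seed=0):
    """Singular values of X / sqrt(n) for independent, non-identically
    distributed columns (illustrative assumptions throughout)."""
    rng = np.random.default_rng(seed)

    # Assumption: one orthogonal eigenbasis U shared by every column,
    # so Sigma_j = U diag(D[:, j]) U^T for all j.
    U, _ = np.linalg.qr(rng.standard_normal((p, p)))

    # Assumption: column-specific eigenvalues drawn uniformly from [0.5, 2].
    D = rng.uniform(0.5, 2.0, size=(p, n))

    # Assumption: Gaussian noise. Column j of X is U diag(sqrt(D[:, j])) z_j,
    # which has covariance U diag(D[:, j]) U^T.
    Z = rng.standard_normal((p, n))
    X = U @ (np.sqrt(D) * Z)

    # Assumption: 1/sqrt(n) normalisation, as for a sample covariance matrix.
    s = np.linalg.svd(X / np.sqrt(n), compute_uv=False)
    return s, D


if __name__ == "__main__":
    s, D = simulate_singular_values()
    # The average squared singular value should be close to the average eigenvalue.
    print("mean squared singular value:", np.sum(s**2) / D.shape[0])
    print("mean eigenvalue of the Sigma_j:", D.mean())
```

A histogram of `s` for increasing `p` and `n` with `p / n` held fixed gives an empirical picture of the kind of limiting distribution the abstract refers to, under these simplified assumptions only.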
