Detection of Block-Exchangeable Structure in High-Dimensional
Correlation Matrices

Correlation matrices are omnipresent in multivariate data analysis. When the number of variables $d$ is large, however, their sample estimates are typically noisy and may obscure underlying dependence patterns. In this article, we assume that the variables under study can be grouped into $K$ clusters with exchangeable dependence. Under this partial exchangeability condition, the corresponding correlation matrix has a block structure and the number of unknown parameters is reduced from $d(d-1)/2$ to at most $K(K+1)/2$. We propose an efficient algorithm to identify the clusters without assuming that $K$ is known a priori. As a by-product, we obtain an improved estimator of the correlation matrix and its inverse, along with its asymptotic variance. Our procedure is based on Kendall's rank correlation, which makes it robust, margin-free, and valid whenever the marginal distributions are continuous; no assumption of multivariate Normality is required. When the data are Normal or, more generally, elliptical, we show that our results extend easily to classical linear correlation matrices and their inverses. The procedure is illustrated on financial stock returns; the clusters identified correspond closely to different business sectors. Technical proofs and the R code of the algorithm are provided in the Online Supplement.
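To make the parameter reduction concrete, the following is a minimal Python sketch (not the R code from the Online Supplement) of how a matrix of Kendall's rank correlations could be averaged within and between clusters once a partition is given. The function names `kendall_tau`, `tau_matrix`, and `block_average`, and the `labels` argument, are hypothetical. The sketch assumes the cluster labels are already known and the data have no ties; it does not implement the cluster-identification algorithm itself.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for two equal-length samples, assuming no ties."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 if concordant, -1 if discordant
    return s / (n * (n - 1) / 2)


def tau_matrix(columns):
    """Pairwise Kendall's tau matrix for a list of variables (columns)."""
    d = len(columns)
    T = [[1.0] * d for _ in range(d)]
    for i, j in combinations(range(d), 2):
        T[i][j] = T[j][i] = kendall_tau(columns[i], columns[j])
    return T


def block_average(T, labels):
    """Average off-diagonal entries of T over each pair of clusters.

    labels[i] is the (assumed known) cluster of variable i. Returns the
    block-structured matrix and the number of distinct averaged values,
    which is at most K(K+1)/2 for K clusters.
    """
    d = len(T)
    sums, counts = {}, {}
    for i, j in combinations(range(d), 2):
        key = tuple(sorted((labels[i], labels[j])))
        sums[key] = sums.get(key, 0.0) + T[i][j]
        counts[key] = counts.get(key, 0) + 1
    A = [[1.0] * d for _ in range(d)]
    for i, j in combinations(range(d), 2):
        key = tuple(sorted((labels[i], labels[j])))
        A[i][j] = A[j][i] = sums[key] / counts[key]
    return A, len(sums)
```

For example, with four variables split into two clusters of two, `block_average(T, [0, 0, 1, 1])` replaces the six distinct off-diagonal entries of `T` by three averaged values: one within each cluster and one between the two clusters.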