High-Dimensional Canonical Correlation Analysis

Abstract
This paper studies high-dimensional canonical correlation analysis (CCA), with an emphasis on the vectors that define the canonical variables. The paper shows that when the two dimensions of the data grow to infinity jointly and proportionally, the classical CCA procedure for estimating those vectors fails to deliver a consistent estimate. This provides the first result on the impossibility of identifying canonical variables in the CCA procedure when all dimensions are large. To offset this, the paper derives the magnitude of the estimation error, which can be used in practice to assess the precision of CCA estimates. An application of the results to a limestone grassland data set is provided.
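The inconsistency described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method and does not reproduce its error formula. It implements textbook sample CCA (whitening followed by an SVD) and applies it to a toy model chosen here purely for illustration: both population covariances are the identity, and a single pair of coordinates is correlated at level `rho`, so the true first canonical vector is the first coordinate axis. The function names, the toy model, and the sample sizes are all assumptions made for this example.

```python
import numpy as np


def sample_cca(X, Y):
    """Classical sample CCA via Cholesky whitening and an SVD.

    Returns the sample canonical correlations (descending) and the
    canonical vectors for X and Y as matrix columns.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / n
    Syy = Yc.T @ Yc / n
    Sxy = Xc.T @ Yc / n
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T (requires n > p, q).
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return s, A, B


def alignment_with_truth(n, p, q, rho=0.9, seed=0):
    """Simulate the toy model and compare the first estimated canonical
    vector for X with the true one (the first coordinate axis).

    Returns (top sample canonical correlation, cosine of the angle
    between the estimated and the true vector).
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)  # shared latent factor
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    # Only the first coordinates of X and Y are correlated (corr = rho);
    # every coordinate keeps unit variance.
    X[:, 0] = np.sqrt(rho) * z + np.sqrt(1 - rho) * X[:, 0]
    Y[:, 0] = np.sqrt(rho) * z + np.sqrt(1 - rho) * Y[:, 0]
    s, A, _ = sample_cca(X, Y)
    a1 = A[:, 0]
    return s[0], abs(a1[0]) / np.linalg.norm(a1)


if __name__ == "__main__":
    n = 1000
    for p in (5, 300):  # dimension small vs. proportional to n
        top_corr, cosine = alignment_with_truth(n, p, p)
        print(f"p = q = {p}: top sample correlation {top_corr:.3f}, "
              f"cosine with true vector {cosine:.3f}")
```

When the dimensions are small relative to the sample size, the estimated vector should be nearly aligned with the true one. When they are a fixed fraction of the sample size, the alignment should drop visibly and the top sample correlation should overshoot `rho`, which is the qualitative behavior the abstract describes.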