A Generalization of the Pearson Correlation to Riemannian Manifolds
The increasing application of deep learning is accompanied by a shift towards highly non-linear statistical models. In terms of their geometry it is natural to identify these models with Riemannian manifolds. Further analysis of such statistical models therefore raises the question of a correlation measure that (1) equals the respective Pearson correlation in the cutting planes of the tangent spaces and (2) extends to a correlation measure normalized with respect to the underlying Riemannian manifold. To this end, the first section reconstitutes elementary properties of the Pearson correlation to derive a representation with respect to a regression line. The second section introduces principal components to derive this line, and thereupon to generalize it to linear subspaces of its embedding space. The theory is thereby derived for generic elliptical distributions, and the principal components are introduced as an orthogonal basis of the embedding space that decorrelates elliptically distributed random vectors. In the subsequent section the spaces spanned by principal components are used to identify the tangent spaces of principal manifolds. As principal manifolds, however, are not guaranteed to exist for arbitrary underlying densities, a class of smooth-manifold-based densities is introduced that closes this gap. Finally the Riemann-Pearson correlation is defined, which is shown to generalize the Pearson correlation to densities of this class.
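Two ingredients of the construction sketched above, the regression-line representation of the Pearson correlation and the decorrelating property of a principal-component basis, can be checked numerically. The following NumPy sketch is illustrative only and not taken from the paper; the sample size and covariance matrix are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a correlated 2-D Gaussian, a simple elliptical distribution.
cov = np.array([[2.0, 1.2],
                [1.2, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=10_000)
x, y = X[:, 0], X[:, 1]

# Regression-line representation of the Pearson correlation:
# r equals the least-squares slope of y on x, rescaled by std(x)/std(y).
r = np.corrcoef(x, y)[0, 1]
slope = np.polyfit(x, y, 1)[0]
r_from_line = slope * x.std() / y.std()
assert np.isclose(r, r_from_line)

# Principal components: an orthogonal basis of the embedding space
# that decorrelates the sample.
eigvals, eigvecs = np.linalg.eigh(np.cov(X.T))
Z = X @ eigvecs  # coordinates in the principal-component basis
assert abs(np.corrcoef(Z.T)[0, 1]) < 1e-10
```

In the paper's setting these two facts are combined: the regression line is recovered as a principal component, which is then generalized from lines to linear subspaces and, via tangent spaces, to principal manifolds.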
View on arXiv