
Large-Scale Approximate Kernel Canonical Correlation Analysis

Abstract

Kernel canonical correlation analysis (KCCA) is a nonlinear multi-view representation learning technique with broad applicability in statistics and machine learning. Although there is a closed-form solution for the KCCA objective, it involves solving an N×N eigenvalue system, where N is the training set size, making its memory and time requirements prohibitive for large-scale problems. Various approximation techniques have been developed for KCCA. A commonly used approach is to first transform the original inputs to an M-dimensional random feature space, so that inner products in the feature space approximate kernel evaluations, and then apply linear CCA to the transformed inputs. In many applications, however, the dimensionality M of the random feature space may need to be very large to obtain a sufficiently good approximation; it then becomes challenging to perform the linear CCA step on the resulting very high-dimensional data matrices. We show how to use a stochastic optimization algorithm, recently proposed for linear CCA and its neural-network extension, to further alleviate the computational requirements of approximate KCCA. This approach allows us to run approximate KCCA on a speech dataset with 1.4 million training samples and a random feature space of dimensionality M = 100,000 on a typical workstation.
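To make the random-feature route concrete, below is a minimal Python sketch: it maps two synthetic views through random Fourier features (so that inner products approximate RBF kernel evaluations) and then runs linear CCA on the transformed data. This is an illustrative reconstruction under assumed choices (an RBF kernel, scikit-learn's RBFSampler and CCA, toy problem sizes), not the paper's exact pipeline; in particular, the paper replaces the exact linear CCA solve with a stochastic optimization algorithm so that M = 100,000 remains tractable.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.cross_decomposition import CCA

rng = np.random.RandomState(0)
N, d1, d2, M = 5000, 50, 40, 300  # toy sizes; the paper scales to N = 1.4M, M = 100,000

# Two synthetic "views" sharing a latent signal (placeholder data, standing
# in for the two views of a real multi-view dataset).
Z = rng.randn(N, 10)
X = np.tanh(Z @ rng.randn(10, d1)) + 0.1 * rng.randn(N, d1)
Y = np.tanh(Z @ rng.randn(10, d2)) + 0.1 * rng.randn(N, d2)

# Step 1: map each view to an M-dimensional random Fourier feature space,
# so that phi(x) . phi(x') approximates the RBF kernel k(x, x').
phi_x = RBFSampler(gamma=0.5, n_components=M, random_state=1).fit_transform(X)
phi_y = RBFSampler(gamma=0.5, n_components=M, random_state=2).fit_transform(Y)

# Step 2: linear CCA on the transformed inputs approximates KCCA. For very
# large M, this exact solve is the bottleneck that the stochastic
# optimization discussed in the abstract is meant to avoid.
cca = CCA(n_components=4)
U, V = cca.fit_transform(phi_x, phi_y)
corrs = [abs(np.corrcoef(U[:, k], V[:, k])[0, 1]) for k in range(4)]
print("approximate canonical correlations:", np.round(corrs, 3))
```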
