Guaranteed Classification via Regularized Similarity Learning

Learning an appropriate (dis)similarity function from the available data is a central problem in machine learning, since the success of many machine learning algorithms critically depends on the choice of a similarity function to compare examples. Although many approaches to similarity metric learning have been proposed, there is little theoretical study of the links between similarity metric learning and the classification performance of the resulting classifier. In this paper, we propose a regularized similarity learning formulation associated with general matrix norms and establish its generalization bounds. We show that the generalization error of the resulting linear separator can be bounded by the derived generalization bound of similarity learning. Hence, good generalization of the learnt similarity function guarantees good classification by the resulting linear classifier. Our results extend and improve those obtained by Bellet et al. \cite{Bellet}. Because their techniques rely on the notion of uniform stability \cite{Bous}, the bound obtained there holds only for Frobenius matrix-norm regularization and depends strongly on the dimensionality of the input space. Our techniques, based on the Rademacher complexity \cite{BM} and a related Khinchin-type inequality, enable us to obtain bounds for sparse $\ell^1$-norm and mixed $(2,1)$-norm regularization that depend only mildly on the input dimensionality.
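To make the setup concrete, the following is a minimal sketch, assuming a bilinear similarity $K_A(x, x') = x^\top A x'$ learned by minimizing a hinge-type empirical loss plus a matrix-norm penalty (Frobenius or entrywise $\ell^1$), followed by the label-weighted linear separator used in good-similarity frameworks of this kind. The loss, optimizer, and hyperparameters below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def learn_similarity(X, y, lam=0.1, norm="l1", gamma=1.0, n_iters=500, lr=0.05):
    """Subgradient descent on a regularized similarity-learning objective:
       (1/n) sum_i hinge(1 - y_i * (1/(n*gamma)) * sum_j y_j x_i^T A x_j)
       + lam * ||A||, where ||.|| is the Frobenius or entrywise L1 matrix norm."""
    n, d = X.shape
    A = np.zeros((d, d))
    m = X.T @ y                                   # sum_j y_j x_j
    for _ in range(n_iters):
        S = X @ A @ X.T                           # S[i, j] = K_A(x_i, x_j)
        margins = y * (S @ y) / (n * gamma)       # per-point margins
        active = margins < 1.0                    # points with nonzero hinge loss
        # subgradient of the averaged hinge loss with respect to A
        G = -np.outer(X[active].T @ y[active], m) / (gamma * n * n)
        # subgradient of the matrix-norm penalty
        if norm == "fro":
            R = lam * A / max(np.linalg.norm(A, "fro"), 1e-12)
        elif norm == "l1":
            R = lam * np.sign(A)
        else:
            raise ValueError(f"unknown norm: {norm}")
        A -= lr * (G + R)
    return A

def predict(A, X_train, y_train, X_test):
    """Linear separator built from the learned similarity:
       h(x) = sign((1/n) * sum_j y_j K_A(x, x_j))."""
    scores = (X_test @ A @ X_train.T) @ y_train / len(y_train)
    return np.sign(scores)

# Toy usage: labels determined by a linear rule in two coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])
A = learn_similarity(X, y, lam=0.01, norm="l1")
print("training accuracy:", np.mean(predict(A, X, y, X) == y))
```

The $\ell^1$ penalty in this sketch encourages a sparse similarity matrix, which is the regime where the paper's Rademacher-complexity analysis yields bounds with only mild dependence on the input dimensionality.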