Cross-Modal Similarity Learning: A Low Rank Bilinear Formulation

International Conference on Information and Knowledge Management (CIKM), 2014
Abstract

Cross-media retrieval has received much attention in recent years due to the rapid growth of multimedia data on the Internet. A recent approach to the problem matches features from different modalities directly. The key challenges are therefore how to remove the heterogeneity between modalities and how to match features of different dimensions. Metric learning, meanwhile, is a powerful tool for learning a distance metric that captures the relationships between data points. However, traditional metric learning algorithms focus on features from a single modality and have difficulty handling heterogeneous features of different dimensions. In this paper, we propose a heterogeneous similarity learning algorithm, based on metric learning, for cross-modal feature matching. With nuclear-norm penalization, an accelerated proximal gradient algorithm is used to find the optimal solution with a fast convergence rate of O(1/t^2). We apply the method to image-text cross-media retrieval and compare it with several popular and state-of-the-art algorithms. Experiments on two well-known databases show that the proposed method outperforms the state-of-the-art algorithms it is compared against.
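To make the optimization idea concrete, the following is a minimal sketch of an accelerated proximal gradient (FISTA-style) loop for a nuclear-norm-penalized bilinear similarity s(x, y) = x^T W y. It is not the paper's implementation. The abstract does not state the loss function, so the squared loss on pair labels used here is an assumption, as are the function names (`svt`, `objective`, `fit_bilinear_apg`), the step-size bound, and the default parameters.

```python
import numpy as np


def svt(W, tau):
    """Singular value thresholding: the proximal operator of tau * ||W||_*."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Shrink each singular value by tau and clip at zero.
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def bilinear_scores(W, X, Y):
    """Score x_i^T W y_i for each paired row of X (n, dx) and Y (n, dy)."""
    return np.sum((X @ W) * Y, axis=1)


def objective(W, X, Y, labels, lam):
    """Assumed objective: mean squared loss plus lam * nuclear norm of W."""
    nuclear_norm = np.linalg.svd(W, compute_uv=False).sum()
    loss = 0.5 * np.mean((bilinear_scores(W, X, Y) - labels) ** 2)
    return loss + lam * nuclear_norm


def fit_bilinear_apg(X, Y, labels, lam=0.01, n_iter=200):
    """Accelerated proximal gradient on the assumed objective above."""
    n, dx = X.shape
    dy = Y.shape[1]
    # Upper bound on the Lipschitz constant of the smooth part's gradient
    # (assumes the data are not all zero).
    L = np.mean(np.sum(X ** 2, axis=1) * np.sum(Y ** 2, axis=1))
    W = np.zeros((dx, dy))
    Z = W.copy()  # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        resid = bilinear_scores(Z, X, Y) - labels
        grad = X.T @ (resid[:, None] * Y) / n
        # Gradient step on the smooth loss, then the nuclear-norm prox.
        W_next = svt(Z - grad / L, lam / L)
        # Momentum update that gives the O(1/t^2) rate.
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = W_next + ((t - 1.0) / t_next) * (W_next - W)
        W, t = W_next, t_next
    return W
```

Because W has shape (dx, dy), the two modalities may have different feature dimensions, and the nuclear-norm proximal step pushes small singular values of W to zero, which favors a low-rank solution. In this sketch `X` could hold image features and `Y` the paired text features, with `labels` marking matching and non-matching pairs.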
