Unsupervised Concatenation Hashing via Combining Subspace Learning and Graph Embedding for Cross-Modal Image Retrieval

Abstract

Unlike content-based image retrieval methods, cross-modal image retrieval methods uncover the rich semantic-level information of social images to further understand image content. Since multi-modal data depict a common object from multiple perspectives, many works focus on learning a unified subspace representation. Recently, hash representations have received much attention in the retrieval field. In the common Hamming space, how to directly preserve the local manifold structure among objects becomes an interesting problem. Most unsupervised hashing methods attempt to solve it by constructing a neighborhood graph on each modality separately. However, it is hard to decide the weight factor of each graph so as to obtain the optimal graph. To overcome this problem, we adopt concatenated features to represent the common object, since the information implied by different modalities is complementary. In our framework, Locally Linear Embedding and Locality Preserving Projection are introduced to reconstruct the manifold structure of the original space. In addition, an ℓ2,1-norm constraint is imposed on the projection matrices to obtain discriminative hashing functions. Extensive experiments are performed on three public datasets, and the results show that our method outperforms several classic unsupervised hashing models.
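The ℓ2,1-norm mentioned above is the sum of the Euclidean norms of a matrix's rows; penalizing it tends to zero out entire rows of a projection matrix, which acts as joint feature selection across all hash bits. A minimal NumPy sketch (the function name and example matrix are illustrative, not from the paper):

```python
import numpy as np

def l21_norm(W):
    """ℓ2,1-norm: sum of the ℓ2 norms of the rows of W.

    As a regularizer, it encourages whole rows of the projection
    matrix to shrink to zero, discarding uninformative features
    for every hash bit at once.
    """
    return np.sum(np.sqrt(np.sum(W ** 2, axis=1)))

W = np.array([[3.0, 4.0],   # row norm 5
              [0.0, 0.0],   # row norm 0 (feature dropped)
              [1.0, 0.0]])  # row norm 1
print(l21_norm(W))  # 5 + 0 + 1 = 6.0
```

This differs from the Frobenius norm, which squares every entry and therefore spreads small weights over all rows instead of producing row-wise sparsity.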
