SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

28 July 2016

Yuxin Peng

Junchao Zhang

Abstract

Hashing methods are widely used for efficient similarity retrieval on large-scale image datasets. Traditional hashing methods learn hash functions that generate binary codes from hand-crafted features; their accuracy is limited because hand-crafted features cannot optimally represent image content or preserve semantic similarity. Recently, several deep hashing methods have shown better performance because deep architectures generate more discriminative feature representations. However, these deep hashing methods are mainly designed for supervised scenarios: they exploit only the semantic similarity information and ignore the underlying data structures. In this paper, we propose the semi-supervised deep hashing (SSDH) method, which performs more effective hash learning by simultaneously preserving the semantic similarity and the underlying data structures. Our approach can be divided into two phases. First, a deep network is designed to extensively exploit both the labeled and unlabeled data, in which we construct the similarity graph online within each mini-batch from the deep feature representations. To the best of our knowledge, the proposed deep network is the first deep hashing method that performs hash code learning and feature learning simultaneously in a semi-supervised fashion. Second, we propose a loss function suited to the semi-supervised scenario that jointly minimizes the empirical error on the labeled data and the embedding error on both the labeled and unlabeled data. This preserves the semantic similarity and captures meaningful neighbors on the underlying data structures for effective hashing. Experimental results on four widely used datasets show that the proposed approach outperforms state-of-the-art hashing methods.
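The abstract does not state the loss in closed form, so the following is only a minimal sketch of how a two-term semi-supervised objective of this kind could be computed on one mini-batch. Everything specific here is an assumption made for illustration and is not taken from the paper: the contrastive-style pairwise term with a margin, the k-nearest-neighbor graph built from squared Euclidean distances, the trade-off `weight`, and all function and parameter names (`knn_graph`, `semi_supervised_loss`, `margin`, `k`). Unlabeled items are marked with `None`.

```python
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_graph(features: Sequence[Vector], k: int) -> List[List[int]]:
    """For each item, return the indices of its k nearest neighbors
    within the mini-batch (the item itself is excluded)."""
    n = len(features)
    neighbors = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(features[i], features[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def semi_supervised_loss(
    codes: Sequence[Vector],
    features: Sequence[Vector],
    labels: Sequence[Optional[int]],
    k: int = 2,
    margin: float = 4.0,   # assumed hinge margin for dissimilar pairs
    weight: float = 0.1,   # assumed trade-off between the two terms
) -> Tuple[float, float, float]:
    """Return (total, empirical, embedding) for one mini-batch."""
    n = len(codes)

    # Empirical term (assumed contrastive form): labeled pairs only.
    # Same-label pairs are pulled together; different-label pairs are
    # pushed at least `margin` apart.
    emp_sum, emp_pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] is None or labels[j] is None:
                continue
            d = sq_dist(codes[i], codes[j])
            emp_sum += d if labels[i] == labels[j] else max(margin - d, 0.0)
            emp_pairs += 1
    empirical = emp_sum / emp_pairs if emp_pairs else 0.0

    # Embedding term: graph neighbors, labeled or not, should receive
    # nearby codes. The graph is rebuilt from the current batch features.
    emb_sum, emb_pairs = 0.0, 0
    for i, nbrs in enumerate(knn_graph(features, k)):
        for j in nbrs:
            emb_sum += sq_dist(codes[i], codes[j])
            emb_pairs += 1
    embedding = emb_sum / emb_pairs if emb_pairs else 0.0

    return empirical + weight * embedding, empirical, embedding


if __name__ == "__main__":
    # Toy batch: three labeled items and one unlabeled item (None).
    feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.2]]
    codes = [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]
    labels = [0, 0, 1, None]
    print(semi_supervised_loss(codes, feats, labels, k=1))
```

In this sketch the unlabeled item contributes nothing to the empirical term but still shapes the codes through the neighbor graph, which is the role the abstract assigns to unlabeled data. In a trained network the codes would be relaxed real-valued outputs and the graph would be recomputed for every mini-batch as the features change.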
