Neural network-based clustering using pairwise constraints

Abstract

In this work, we address the problem of clustering high-dimensional data using only pairwise constraints as input. Our strategy uses back-propagation to train a neural network that discovers the clusters while learning the features in the same training process. To do this, we design a novel architecture that incorporates cost functions based on KL divergence, minimizing the distance between similar pairs while maximizing the distance between dissimilar pairs. We also propose an implementation that optimizes the parameters of the architecture more efficiently than a naive implementation, e.g. one based on Siamese networks. Experiments on MNIST and CIFAR-10 show that the accuracy of the proposed approach is comparable to or exceeds that of classification. Reasonable clusters could also be discovered when only partial pairwise constraints were available. The robustness analysis shows that the method can tolerate moderate noise and is largely insensitive to the number of clusters.
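
As a rough illustration of the pairwise cost described above, the following minimal sketch computes a KL-based cost between the cluster-assignment distributions of two samples. The abstract does not give the exact formulation, so several details here are assumptions: the function names `kl_divergence` and `pairwise_cost`, the symmetric sum of the two KL terms, and the hinge with a `margin` parameter for dissimilar pairs are all hypothetical choices made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    # eps guards against log(0); it is an illustrative choice, not from the paper.
    return sum(
        pi * (math.log(max(pi, eps)) - math.log(max(qi, eps)))
        for pi, qi in zip(p, q)
    )

def pairwise_cost(p, q, similar, margin=2.0):
    """Cost for one pair of predicted cluster distributions.

    Assumption: the distance is the symmetric sum KL(p||q) + KL(q||p).
    Assumption: dissimilar pairs use a hinge, so the cost reaches zero
    once the distance exceeds `margin` (hypothetical parameter).
    """
    distance = kl_divergence(p, q) + kl_divergence(q, p)
    if similar:
        # Similar pair: pull the two distributions together.
        return distance
    # Dissimilar pair: push the distributions apart, up to the margin.
    return max(0.0, margin - distance)

# Example: two samples with near-opposite cluster assignments.
p = [0.9, 0.1]
q = [0.1, 0.9]
cost_if_similar = pairwise_cost(p, q, similar=True)
cost_if_dissimilar = pairwise_cost(p, q, similar=False)
```

In this sketch, a pair labeled similar is penalized in proportion to how far apart its two distributions are, while a pair labeled dissimilar is penalized only when its distributions are closer than the margin. In a trained network, `p` and `q` would be the softmax outputs for the two samples, and the cost would be minimized by back-propagation.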
