Semi-Supervised Learning with Heterophily

9 December 2014

Abstract

We propose a novel linear semi-supervised learning formulation derived from a well-founded probabilistic framework, belief propagation. We show that our formulation generalizes a number of label propagation algorithms described in the literature by allowing them to propagate generalized assumptions about the influences between classes of neighboring nodes. We call this formulation Semi-Supervised Learning with Heterophily (SSL-H). We also show how the affinity matrix, which encodes these between-class influences, can be learned from observed data with a simple convex optimization framework inspired by locally linear embedding. We call this approach Linear Heterophily Estimation (LHE). Experiments on synthetic data show that the two approaches combined can learn the heterophily of a graph with 1M nodes, 10M edges, and few labels in under 1 minute, and that they give better labeling accuracy than a baseline method when only a small fraction of nodes is explicitly labeled.
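To make the idea of propagating labels through class-to-class influences concrete, here is a minimal sketch of a generic linear propagation update of the form F <- X + eps * W F H. It is an illustration only and is not claimed to be the exact SSL-H equations of the paper. The function name, the scaling factor `eps`, the centering of `X` and `H`, and the toy graph are all assumptions made for this example.

```python
import numpy as np

def propagate_with_compatibility(W, X, H, eps=0.1, iters=50):
    """Iterate F <- X + eps * W @ F @ H (hypothetical illustrative update).

    W: (n, n) adjacency matrix, X: (n, k) centered explicit labels,
    H: (k, k) centered class-compatibility (affinity) matrix.
    eps is assumed small enough for the iteration to converge.
    """
    F = X.copy()
    for _ in range(iters):
        F = X + eps * (W @ F @ H)
    return F

# Toy path graph 0 - 1 - 2 - 3 (assumed example data).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Only node 0 is labeled (class 0); rows are centered around zero.
X = np.zeros((4, 2))
X[0] = [0.5, -0.5]

# Heterophily: neighbors tend to belong to the opposite class.
H_hetero = np.array([[-0.5, 0.5],
                     [0.5, -0.5]])
# Homophily: neighbors tend to share a class (classic label propagation).
H_homo = -H_hetero

labels_hetero = propagate_with_compatibility(W, X, H_hetero).argmax(axis=1)
labels_homo = propagate_with_compatibility(W, X, H_homo).argmax(axis=1)
print(labels_hetero, labels_homo)
```

On this toy graph, the heterophily matrix makes the predicted classes alternate along the path, while the homophily matrix assigns every node the class of the single labeled node. Swapping `H` is the only change between the two runs, which is the sense in which a compatibility matrix generalizes plain label propagation.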
