Semi-Supervised Learning for Asymmetric Graphs through Reach and Distance Diffusion

Abstract

Semi-supervised learning algorithms are an indispensable tool when labeled examples are scarce and unlabeled examples are plentiful [Blum and Chawla 2001, Zhu et al. 2003]. With graph-based methods, entities (examples) correspond to nodes in a graph and edges connect related entities. The graph structure is used to infer implicit pairwise affinity values (a kernel), which are then used to compute the learned labels. Two powerful techniques for defining such a kernel are "symmetric" spectral methods and Personalized PageRank (PPR). With spectral methods, labels can be learned scalably using Jacobi iterations, but an inherent limitation is that they apply only to undirected graphs, whereas relations between entities, such as likes, follows, or hyperlinks, are often inherently asymmetric. PPR works naturally with directed graphs but, even with state-of-the-art techniques, does not scale when we want to learn billions of labels. Aiming at both high scalability and handling of directed relations, we propose here Reach Diffusion and Distance Diffusion kernels. Our design is inspired by models for influence diffusion in social networks, formalized in and spawned by the seminal work of [Kempe, Kleinberg, and Tardos 2003]. These models apply to directed interactions and are naturally suited to asymmetry. We tailor these models to define a natural asymmetric "kernel" and design highly scalable algorithms for parameter setting and label learning.
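
To make the symmetric baseline mentioned above concrete, the following is a minimal sketch of label learning by Jacobi iterations on an undirected graph, in the spirit of the harmonic-function formulation of [Zhu et al. 2003]. It is not the reach or distance diffusion method proposed in this paper. The function name `jacobi_label_propagation`, the adjacency-list representation, and the toy path graph are all assumptions made for illustration.

```python
def jacobi_label_propagation(neighbors, seed_labels, iterations=100):
    """Illustrative Jacobi iteration for harmonic label propagation.

    neighbors:   dict mapping node -> list of (neighbor, weight) pairs;
                 assumed symmetric (undirected graph).
    seed_labels: dict mapping labeled node -> label value (held fixed).
    Returns a dict mapping every node -> soft label.
    """
    # Unlabeled nodes start at 0.0; seed nodes start at their given label.
    labels = {v: seed_labels.get(v, 0.0) for v in neighbors}
    for _ in range(iterations):
        new_labels = {}
        for v, nbrs in neighbors.items():
            if v in seed_labels:
                # Seed labels are clamped and never updated.
                new_labels[v] = seed_labels[v]
                continue
            total = sum(w for _, w in nbrs)
            if total > 0:
                # Jacobi step: weighted average of the previous iterate's
                # neighbor labels (all updates use the old values).
                new_labels[v] = sum(w * labels[u] for u, w in nbrs) / total
            else:
                # Isolated node: nothing to average, keep its value.
                new_labels[v] = labels[v]
        labels = new_labels
    return labels


# Hypothetical toy input: the path a - b - c - d with unit edge weights,
# where a is labeled 1.0 and d is labeled 0.0.
path_graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
learned = jacobi_label_propagation(path_graph, {"a": 1.0, "d": 0.0})
```

On this toy path the learned labels of `b` and `c` approach 2/3 and 1/3, the values that interpolate linearly between the two seeds. The averaging step treats every edge as bidirectional, which is the limitation on asymmetric relations that the abstract points out.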
