
Co-clustering for Directed Graphs; the Stochastic Co-Blockmodel and a Spectral Algorithm

Abstract

Communities of highly connected actors form an essential feature in the structure of several empirical directed and undirected networks. However, compared to the amount of research on clustering for undirected graphs, there is relatively little understanding of clustering in directed networks. This paper extends the spectral clustering algorithm to directed networks in a way that co-clusters, or bi-clusters, the rows and columns of a graph Laplacian. Co-clustering leverages the increased complexity of asymmetric relationships to gain new insight into the structure of the directed network. To understand this algorithm and to study its asymptotic properties in a canonical setting, we propose the Stochastic Co-Blockmodel to encode co-clustering structure. This is the first statistical model of co-clustering, and it is derived using the concept of stochastic equivalence that motivated the original Stochastic Blockmodel. Although directed spectral clustering is not derived from the Stochastic Co-Blockmodel, we show that the algorithm can estimate the blocks in a high-dimensional asymptotic setting in which the number of blocks grows with the number of nodes. The algorithm, model, and asymptotic results can all be extended to bipartite graphs.
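To make the idea of co-clustering the rows and columns of a graph Laplacian concrete, the following is a minimal sketch in Python with NumPy. It is not taken from the paper. The abstract does not spell out the algorithm's steps, so the details here are assumptions: the degree regularization `tau`, the row normalization of the singular vectors, the simple k-means routine, and the example block matrix `B`. All function names are hypothetical. The sketch samples a directed graph in which node i sends an edge to node j with probability `B[y[i], z[j]]`, where `y` and `z` are the sending and receiving block labels, and then clusters the left and right singular vectors separately.

```python
from itertools import permutations

import numpy as np


def sample_co_blockmodel(y, z, B, rng):
    """Sample a directed adjacency matrix with P(i -> j) = B[y[i], z[j]]."""
    P = B[y][:, z]                      # n x n matrix of edge probabilities
    A = (rng.random(P.shape) < P).astype(float)
    np.fill_diagonal(A, 0.0)            # assumption: no self-loops
    return A


def kmeans(X, k, n_iter=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def directed_spectral_cocluster(A, k, tau=None):
    """Cluster senders (rows) and receivers (columns) of a directed graph."""
    out_deg = A.sum(axis=1)
    in_deg = A.sum(axis=0)
    if tau is None:
        tau = out_deg.mean()            # assumption: regularize by mean degree
    # Degree-normalized (Laplacian-style) matrix; asymmetric in general.
    L = A / np.sqrt(out_deg + tau)[:, None] / np.sqrt(in_deg + tau)[None, :]
    U, _, Vt = np.linalg.svd(L, full_matrices=False)
    U_k, V_k = U[:, :k], Vt[:k].T       # top-k left / right singular vectors

    def row_normalize(M):
        norms = np.linalg.norm(M, axis=1, keepdims=True)
        return M / np.maximum(norms, 1e-12)

    send_labels = kmeans(row_normalize(U_k), k)
    recv_labels = kmeans(row_normalize(V_k), k)
    return send_labels, recv_labels


def agreement(labels, truth, k):
    """Best fraction of matching labels over all relabelings of the clusters."""
    return max(np.mean(np.array(p)[labels] == truth)
               for p in permutations(range(k)))


rng = np.random.default_rng(0)
n, k = 300, 3
y = np.repeat(np.arange(k), n // k)     # sending blocks: consecutive nodes
z = np.arange(n) % k                    # receiving blocks: a different partition
B = np.full((k, k), 0.05)               # illustrative block matrix (assumption)
B[0, 0] = B[1, 2] = B[2, 1] = 0.6
A = sample_co_blockmodel(y, z, B, rng)
send_labels, recv_labels = directed_spectral_cocluster(A, k)
print("sender agreement:", agreement(send_labels, y, k))
print("receiver agreement:", agreement(recv_labels, z, k))
```

The point of the example is that the sending partition `y` and the receiving partition `z` differ, so a single symmetric clustering could not describe both; the left singular vectors carry the sending structure and the right singular vectors carry the receiving structure.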
