
Node similarity is a fundamental problem in graph analytics. However, similarity between nodes in different graphs (inter-graph nodes) has not yet been adequately investigated. Inter-graph node similarity is important for learning a new graph from knowledge of an existing graph (transfer learning on graphs) and has applications in biological, communication, and social networks. In this paper, we propose a novel distance function for measuring inter-graph node similarity based on edit distance, called NED. In NED, two inter-graph nodes are compared according to their local neighborhood topological structures, which are represented as unlabeled, unordered k-adjacent trees. Since computing tree edit distance on unordered trees is NP-complete, we propose a modified tree edit distance, called TED*, for comparing neighborhood trees. Like the original tree edit distance, TED* is a metric; more importantly, it is computable in polynomial time. Compared with existing inter-graph node similarity measures, NED is a metric on nodes and therefore admits efficient indexing methods; moreover, because it uses topological information, it is a more precise measure for real-world applications such as graph de-anonymization. The efficiency and effectiveness of NED are demonstrated empirically on real-world graphs.
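To make the notion of a node's neighborhood tree concrete, the following is a minimal sketch of how a k-adjacent tree might be extracted. It assumes that the k-adjacent tree of a node is the breadth-first search tree rooted at that node, truncated to nodes at most k edges from the root. The function names `k_adjacent_tree` and `level_sizes`, the adjacency-dictionary input format, and the sorted visiting order used to break ties are illustrative assumptions, not the authors' implementation. The sketch does not compute TED* itself.

```python
from collections import deque


def k_adjacent_tree(adj, root, k):
    """Return the BFS tree rooted at `root`, cut off at depth k.

    `adj` maps each node to an iterable of its neighbors (assumed format).
    The result maps each tree node to the list of its children.
    """
    children = {root: []}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            # Nodes at depth k are leaves of the truncated tree.
            continue
        # Sorted order is only a tie-breaking assumption so that the
        # sketch is deterministic; the trees are treated as unordered.
        for v in sorted(adj[u]):
            if v not in depth:
                depth[v] = depth[u] + 1
                children[u].append(v)
                children[v] = []
                queue.append(v)
    return children


def level_sizes(children, root):
    """Return the number of tree nodes on each level, starting at the root."""
    sizes = []
    level = [root]
    while level:
        sizes.append(len(level))
        level = [c for u in level for c in children[u]]
    return sizes


# Small example graph: a 4-cycle 0-1-3-2-0 with a pendant node 4 attached to 3.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
tree = k_adjacent_tree(adj, 0, 2)
print(tree)
print(level_sizes(tree, 0))
```

Because the trees are unlabeled, only their shape matters when two nodes from different graphs are compared; the node identifiers kept in the dictionary here are just a convenience for building the tree.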