GraPE: fast and scalable Graph Processing and Embedding

Nature Computational Science (Nat. Comput. Sci.), 2021

12 October 2021

Christopher J. Mungall

Peter N. Robinson

Justin P. Reese

Abstract

Graph representation learning methods have opened new possibilities for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE, a software resource for graph processing and representation learning that scales to large graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as a substantial and statistically significant improvement in edge prediction and node label prediction performance. Furthermore, GRAPE provides over 80,000 graphs from the literature and other sources, standardized interfaces allowing straightforward integration of third-party libraries, 61 node embedding methods, 25 inference models, and 3 modular pipelines to allow a FAIR and reproducible comparison of methods and libraries for graph processing and embedding.
