
Generative Graph Pattern Machine

10 pages main text, 2 pages appendix, 5 pages bibliography, 7 figures, 10 tables
Abstract

Graph neural networks (GNNs) have been predominantly driven by message-passing, where node representations are iteratively updated via local neighborhood aggregation. Despite their success, message-passing suffers from fundamental limitations -- including constrained expressiveness, over-smoothing, over-squashing, and limited capacity to model long-range dependencies. These issues hinder scalability: increasing data size or model size often fails to yield improved performance. To this end, we explore pathways beyond message-passing and introduce the Generative Graph Pattern Machine (G²PM), a generative Transformer pre-training framework for graphs. G²PM represents graph instances (nodes, edges, or entire graphs) as sequences of substructures, and employs generative pre-training over these sequences to learn generalizable and transferable representations. Empirically, G²PM demonstrates strong scalability: on the ogbn-arxiv benchmark, it continues to improve with model sizes up to 60M parameters, outperforming prior generative approaches that plateau at significantly smaller scales (e.g., 3M). In addition, we systematically analyze the model design space, highlighting key architectural choices that contribute to its scalability and generalization. Across diverse tasks -- including node/link/graph classification, transfer learning, and cross-graph pretraining -- G²PM consistently outperforms strong baselines, establishing a compelling foundation for scalable graph learning. The code and dataset are available at this https URL.
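To make the core idea concrete, below is a minimal conceptual sketch of generative pre-training over substructure-token sequences with a causal Transformer, as the abstract describes. It is not the authors' implementation: the tokenization step, vocabulary size, model dimensions, and all class/function names (e.g., CausalSubstructureLM, pretrain_step) are illustrative assumptions.

```python
# Hedged sketch: next-token generative pre-training over sequences of
# substructure tokens. Names and hyperparameters are assumptions, not the
# paper's released code.
import torch
import torch.nn as nn

class CausalSubstructureLM(nn.Module):
    """Causal Transformer that predicts the next substructure token."""
    def __init__(self, vocab_size=4096, d_model=256, n_heads=8,
                 n_layers=6, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                      # tokens: (B, T) int64 ids
        B, T = tokens.shape
        pos = torch.arange(T, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask so position t attends only to positions <= t.
        causal = torch.triu(torch.full((T, T), float("-inf"),
                                       device=tokens.device), diagonal=1)
        h = self.encoder(x, mask=causal)
        return self.lm_head(h)                      # (B, T, vocab_size) logits

def pretrain_step(model, tokens, optimizer):
    """One generative pre-training step: predict token t+1 from tokens <= t."""
    logits = model(tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: each graph instance (node/edge/graph) is assumed to have been
# tokenized upstream into a sequence of substructure ids (e.g., by hashing
# sampled local patterns); random ids stand in for that step here.
if __name__ == "__main__":
    model = CausalSubstructureLM()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    fake_sequences = torch.randint(0, 4096, (8, 64))   # (batch, seq_len)
    print("loss:", pretrain_step(model, fake_sequences, opt))
```

The sketch only illustrates the pre-training objective; the substructure tokenizer and the downstream fine-tuning heads for node/link/graph classification are described in the paper itself.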
