
Generative Graph Pattern Machine

10 pages main text, 2 pages appendix, 5 pages bibliography, 7 figures, 10 tables
Abstract

Graph neural networks (GNNs) have been predominantly driven by message-passing, where node representations are iteratively updated via local neighborhood aggregation. Despite their success, message-passing suffers from fundamental limitations -- including constrained expressiveness, over-smoothing, over-squashing, and limited capacity to model long-range dependencies. These issues hinder scalability: increasing data size or model size often fails to yield improved performance. To this end, we explore pathways beyond message-passing and introduce the Generative Graph Pattern Machine (G²PM), a generative Transformer pre-training framework for graphs. G²PM represents graph instances (nodes, edges, or entire graphs) as sequences of substructures, and employs generative pre-training over these sequences to learn generalizable and transferable representations. Empirically, G²PM demonstrates strong scalability: on the ogbn-arxiv benchmark, it continues to improve with model sizes up to 60M parameters, outperforming prior generative approaches that plateau at significantly smaller scales (e.g., 3M). In addition, we systematically analyze the model design space, highlighting key architectural choices that contribute to its scalability and generalization. Across diverse tasks -- including node/link/graph classification, transfer learning, and cross-graph pretraining -- G²PM consistently outperforms strong baselines, establishing a compelling foundation for scalable graph learning. The code and dataset are available at this https URL.
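To make the core idea concrete, below is a minimal conceptual sketch of generative pre-training over substructure-token sequences with a causal Transformer, as the abstract describes. It is not the authors' implementation: the tokenization step, vocabulary size, model dimensions, and all class/function names (e.g., CausalSubstructureLM, pretrain_step) are illustrative assumptions.

```python
# Hedged sketch: next-token generative pre-training over sequences of
# substructure tokens. Names and hyperparameters are assumptions, not the
# paper's released code.
import torch
import torch.nn as nn

class CausalSubstructureLM(nn.Module):
    """Causal Transformer that predicts the next substructure token."""
    def __init__(self, vocab_size=4096, d_model=256, n_heads=8,
                 n_layers=6, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):                      # tokens: (B, T) int64 ids
        B, T = tokens.shape
        pos = torch.arange(T, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask so position t attends only to positions <= t.
        causal = torch.triu(torch.full((T, T), float("-inf"),
                                       device=tokens.device), diagonal=1)
        h = self.encoder(x, mask=causal)
        return self.lm_head(h)                      # (B, T, vocab_size) logits

def pretrain_step(model, tokens, optimizer):
    """One generative pre-training step: predict token t+1 from tokens <= t."""
    logits = model(tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage: each graph instance (node/edge/graph) is assumed to have been
# tokenized upstream into a sequence of substructure ids (e.g., by hashing
# sampled local patterns); random ids stand in for that step here.
if __name__ == "__main__":
    model = CausalSubstructureLM()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    fake_sequences = torch.randint(0, 4096, (8, 64))   # (batch, seq_len)
    print("loss:", pretrain_step(model, fake_sequences, opt))
```

The sketch only illustrates the pre-training objective; the substructure tokenizer and the downstream fine-tuning heads for node/link/graph classification are described in the paper itself.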
