Foundation models aim to create general, cross-task, and cross-domain machine learning models by pretraining on large-scale datasets to capture shared patterns or concepts, such as contours, colors, textures, and edges in images, or tokens, words, and sentences in text. However, identifying generalities across graph-structured data remains a significant challenge, as different graph-based tasks necessitate distinct inductive biases, thereby impeding the development of graph foundation models. To address this challenge, we introduce a novel approach for learning cross-task generalities in graphs. Specifically, we propose task-trees as basic learning instances to align task spaces (node, link, graph) on graphs. Then, we conduct a theoretical analysis to examine their stability, transferability, and generalization. Our findings indicate that when a graph neural network (GNN) is pretrained on diverse task-trees using a reconstruction objective, it acquires transferable knowledge, enabling effective adaptation to downstream tasks with an appropriate set of fine-tuning samples. To empirically validate this approach, we develop a pretrained graph model based on task-trees, termed Graph Generality Identifier on Task-Trees (GIT). Extensive experiments demonstrate that a single pretrained GIT model can be effectively adapted to over 30 different graphs across five domains via fine-tuning, in-context learning, or zero-shot learning. Our data and code are available at this https URL.
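The abstract does not specify implementation details, but the core idea of pretraining a GNN on task-trees with a reconstruction objective can be illustrated with a minimal sketch. The sketch below assumes a simple mean-aggregating message-passing encoder over a sampled subtree and a feature-reconstruction loss at the root; all class and function names (TaskTreeEncoder, GITPretrainer) are hypothetical and may differ from the paper's actual GIT implementation.

```python
# Hypothetical sketch: pretraining a GNN on task-trees via root-feature
# reconstruction. Illustrative only; not the paper's official code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskTreeEncoder(nn.Module):
    """Encodes a task-tree (a root node plus its sampled subtree) by
    mean-aggregating child messages toward the root, one hop per layer."""
    def __init__(self, in_dim, hid_dim, num_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * num_layers
        self.layers = nn.ModuleList(
            nn.Linear(dims[i] * 2, dims[i + 1]) for i in range(num_layers)
        )

    def forward(self, x, edge_index):
        # edge_index: (2, E) tensor of directed child -> parent edges.
        src, dst = edge_index
        h = x
        for layer in self.layers:
            agg = torch.zeros_like(h)
            agg.index_add_(0, dst, h[src])  # sum messages from children
            deg = torch.zeros(h.size(0), device=h.device)
            deg.index_add_(0, dst, torch.ones(dst.size(0), dtype=h.dtype))
            agg = agg / deg.clamp(min=1).unsqueeze(-1)  # mean aggregation
            h = F.relu(layer(torch.cat([h, agg], dim=-1)))
        return h

class GITPretrainer(nn.Module):
    """Self-supervised objective: reconstruct the root node's input
    features from its task-tree embedding."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.encoder = TaskTreeEncoder(in_dim, hid_dim)
        self.decoder = nn.Linear(hid_dim, in_dim)

    def forward(self, x, edge_index, root_idx):
        h = self.encoder(x, edge_index)
        recon = self.decoder(h[root_idx])
        return F.mse_loss(recon, x[root_idx])

# Toy usage: a 5-node task-tree rooted at node 0.
x = torch.randn(5, 16)
edge_index = torch.tensor([[1, 2, 3, 4],   # children
                           [0, 0, 1, 1]])  # parents
model = GITPretrainer(in_dim=16, hid_dim=32)
loss = model(x, edge_index, root_idx=torch.tensor([0]))
loss.backward()
```

Under this framing, node-, link-, and graph-level tasks would differ only in how the task-tree is rooted and sampled, so a single pretrained encoder can serve all three task spaces.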