Scaling GraphLLM with Bilevel-Optimized Sparse Querying

30 January 2026

Yangzhe Peng

Haiquan Qiu

Quanming Yao

Kun He


Main: 8 pages · Bibliography: 3 pages · Appendix: 8 pages · 2 figures · 11 tables

Abstract

LLMs have recently shown strong potential in enhancing node-level tasks on text-attributed graphs (TAGs) by providing explanation features. However, their practical use is severely limited by the high computational and monetary cost of repeated LLM queries. To illustrate, naively generating explanations for all nodes on a medium-sized benchmark like Photo (48k nodes) using a representative method (e.g., TAPE) would consume days of processing time. In this paper, we propose Bilevel-Optimized Sparse Querying (BOSQ), a general framework that selectively leverages LLM-derived explanation features to enhance performance on node-level tasks on TAGs. We design an adaptive sparse querying strategy that decides when to invoke LLMs, avoiding redundant or low-gain queries and significantly reducing computational overhead. Extensive experiments on six real-world TAG datasets involving two types of node-level tasks demonstrate that BOSQ achieves orders-of-magnitude speedups over existing GraphLLM methods while consistently delivering on-par or superior performance.
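To make the cost argument concrete, the following is a minimal Python sketch of querying an LLM for only a subset of nodes. It is not BOSQ: the abstract does not describe the bilevel optimization, so the sketch substitutes a plain top-k selection under a fixed query budget. The per-query latency of 5 seconds, the 1% budget, and the per-node gain scores are all hypothetical values chosen for illustration; only the 48k node count comes from the abstract.

```python
def full_query_hours(num_nodes: int, seconds_per_query: float) -> float:
    """Wall-clock hours to query an LLM once per node, sequentially."""
    return num_nodes * seconds_per_query / 3600.0


def select_nodes(scores: list[float], budget: int) -> list[int]:
    """Return the indices of the `budget` highest-scoring nodes.

    `scores` is a hypothetical per-node estimate of how much an LLM
    explanation would help; ties are broken in favor of the lower index.
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(order[:max(budget, 0)])


# Assumed numbers, for illustration only.
NUM_NODES = 48_000         # Photo, as stated in the abstract
SECONDS_PER_QUERY = 5.0    # hypothetical sequential latency per LLM call
BUDGET = NUM_NODES // 100  # hypothetical budget: query 1% of the nodes

dense_hours = full_query_hours(NUM_NODES, SECONDS_PER_QUERY)
sparse_hours = full_query_hours(BUDGET, SECONDS_PER_QUERY)

if __name__ == "__main__":
    print(f"all nodes: {dense_hours:.1f} h, 1% of nodes: {sparse_hours:.2f} h")
    print(select_nodes([0.1, 0.9, 0.5, 0.9], budget=2))
```

Under these assumed numbers, querying every node takes about 66.7 hours (roughly 2.8 days), in line with the "days of processing time" the abstract mentions, while a 1% budget takes about 0.67 hours. How the paper actually chooses which nodes to query may differ substantially from this fixed top-k rule.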
