Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks

While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research and unlock the field's potential.
@article{bechler-speicher2025_2502.14546,
  title={Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks},
  author={Maya Bechler-Speicher and Ben Finkelshtein and Fabrizio Frasca and Luis Müller and Jan Tönshoff and Antoine Siraudin and Viktor Zaverkin and Michael M. Bronstein and Mathias Niepert and Bryan Perozzi and Mikhail Galkin and Christopher Morris},
  journal={arXiv preprint arXiv:2502.14546},
  year={2025}
}