Weisfeiler and Lehman Go Measurement Modeling: Probing the Validity of the WL Test

11 July 2023

Arjun Subramonian


Main: 14 pages

Bibliography: 4 pages

Appendix: 29 pages

5 figures

5 tables

Abstract

The expressive power of graph neural networks is usually measured by comparing how many pairs of graphs or nodes an architecture can possibly distinguish as non-isomorphic to those distinguishable by the $k$-dimensional Weisfeiler-Lehman ($k$-WL) test. In this paper, we uncover misalignments between practitioners' conceptualizations of expressive power and $k$-WL through a systematic analysis of the reliability and validity of $k$-WL. We further conduct a survey ($n = 18$) of practitioners to surface their conceptualizations of expressive power and their assumptions about $k$-WL. In contrast to practitioners' opinions, our analysis (which draws from graph theory and benchmark auditing) reveals that $k$-WL does not guarantee isometry, can be irrelevant to real-world graph tasks, and may not promote generalization or trustworthiness. We argue for extensional definitions and measurement of expressive power based on benchmarks; we further contribute guiding questions for constructing such benchmarks, which is critical for progress in graph machine learning.
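
For readers unfamiliar with the test the abstract refers to, the following is a minimal sketch of 1-WL color refinement (the $k = 1$ case), not code from the paper. The function names `refine` and `wl_distinguishes`, the adjacency-list representation, and the example graphs are assumptions made for illustration. It shows the standard textbook case in which 1-WL cannot tell a 6-cycle from two disjoint triangles, although the two graphs are not isomorphic.

```python
from collections import Counter


def refine(adj):
    """Run 1-WL color refinement until the color partition stops changing.

    adj maps each node to a list of its neighbors (hypothetical format).
    """
    colors = {v: 0 for v in adj}  # every node starts with the same color
    for _ in range(len(adj)):
        # A node's signature is its own color plus the sorted colors of its neighbors.
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v]))) for v in adj}
        # Relabel distinct signatures with small integers.
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        new = {v: palette[sigs[v]] for v in adj}
        # Refinement never merges classes, so an unchanged count means stability.
        if len(set(new.values())) == len(set(colors.values())):
            return new
        colors = new
    return colors


def wl_distinguishes(g, h):
    """Return True if 1-WL assigns the two graphs different color histograms."""
    # Refine the disjoint union so both graphs share one color palette.
    union = {("g", v): [("g", u) for u in nbrs] for v, nbrs in g.items()}
    union.update({("h", v): [("h", u) for u in nbrs] for v, nbrs in h.items()})
    colors = refine(union)
    hist_g = Counter(c for (tag, _), c in colors.items() if tag == "g")
    hist_h = Counter(c for (tag, _), c in colors.items() if tag == "h")
    return hist_g != hist_h


six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print(wl_distinguishes(six_cycle, two_triangles))  # non-isomorphic, yet every node has degree 2
print(wl_distinguishes(path3, triangle))           # degree sequences differ
```

The sketch only illustrates what "distinguishable by 1-WL" means operationally; it does not reproduce the paper's reliability and validity analysis or its survey.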
