Fast Symbolic Regression Benchmarking

International Conference on Swarm Intelligence (ICSI), 2025

20 August 2025

Viktor Martinek


Main: 9 Pages

1 Figure

Bibliography: 1 Page

1 Table

Abstract

Symbolic regression (SR) uncovers mathematical models from data. Several benchmarks have been proposed to compare the performance of SR algorithms. However, existing ground-truth rediscovery benchmarks overemphasize the recovery of "the one" expression form or rely solely on computer algebra systems (such as SymPy) to assess success. Furthermore, existing benchmarks continue the expression search even after the target expression has been discovered. We address these issues by introducing curated lists of acceptable expressions and a callback mechanism for early termination. As a starting point, we use the symbolic regression for scientific discovery (SRSD) benchmark problems proposed by Yoshitomo et al. and benchmark the two SR packages SymbolicRegression.jl and TiSR. The new benchmarking method increases the rediscovery rate of SymbolicRegression.jl from 26.7%, as reported by Yoshitomo et al., to 44.7%, while requiring 41.2% less computational expense. TiSR's rediscovery rate is 69.4%, and its benchmark run takes 63% less time.
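The two ideas named in the abstract, a curated list of acceptable expressions and a callback that stops the search early, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class `AcceptableFormsCallback`, the helper `toy_search`, and the whitespace-only string comparison are hypothetical and are not the API of either benchmarked package, nor necessarily how the paper matches expressions.

```python
import re


def normalize(expr: str) -> str:
    """Remove whitespace so that 'a * m' and 'a*m' compare equal.

    Assumption: plain string matching stands in for whatever
    comparison the real benchmark uses.
    """
    return re.sub(r"\s+", "", expr)


class AcceptableFormsCallback:
    """Hypothetical callback: signals 'stop' once a candidate matches
    any entry of a curated list of acceptable expression forms."""

    def __init__(self, acceptable):
        self.acceptable = {normalize(e) for e in acceptable}
        self.hit = None  # first matching candidate, if any

    def __call__(self, candidate: str) -> bool:
        if normalize(candidate) in self.acceptable:
            self.hit = candidate
            return True  # request early termination
        return False


def toy_search(candidates, callback) -> int:
    """Stand-in for an SR run: visits candidates in order and stops
    as soon as the callback returns True. Returns how many candidates
    were evaluated."""
    evaluated = 0
    for expr in candidates:
        evaluated += 1
        if callback(expr):
            break
    return evaluated


# Several equivalent forms of the same law are all accepted.
acceptable = ["m * a", "a * m"]
stream = ["m + a", "m / a", "a*m", "m * a * 1.0", "m - a"]

callback = AcceptableFormsCallback(acceptable)
evaluated = toy_search(stream, callback)
print(evaluated, callback.hit)
```

In this toy run the search stops at the third candidate instead of visiting all five, which is the kind of saving an early-termination callback is meant to provide. Accepting both `m * a` and `a * m` shows why a list of acceptable forms is more lenient than insisting on a single expression form.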
