Enhancing Symbolic Regression with Quality-Diversity and Physics-Inspired Constraints

Abstract

This paper presents QDSR, an advanced symbolic regression (SR) system that integrates genetic programming (GP), a quality-diversity (QD) algorithm, and a dimensional analysis (DA) engine. Our method focuses on exact symbolic recovery of known expressions from datasets, with a particular emphasis on the Feynman-AI benchmark. On this widely used collection of 117 physics equations, QDSR achieves an exact recovery rate of 91.6%, surpassing all previous SR methods by over 20 percentage points. Our method also exhibits strong robustness to noise. Beyond QD and DA, this high success rate results from a favorable trade-off between vocabulary expressiveness and search space size: we show that significantly expanding the vocabulary with precomputed meaningful variables (e.g., dimensionless combinations and well-chosen scalar products) often reduces equation complexity, ultimately leading to better performance. Ablation studies also show that QD alone already outperforms the state of the art. This suggests that a simple integration of QD, by projecting individuals onto a QD grid, can significantly boost performance in existing algorithms without requiring major system overhauls.
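
As an illustration of what "projecting individuals onto a QD grid" can look like, here is a minimal MAP-Elites-style sketch in Python. It is not taken from QDSR: the `Individual` fields, the choice of behavior descriptors (binned expression length and number of distinct variables), and the bin width are all assumptions made for the example. The abstract does not specify which descriptors the paper uses.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class Individual:
    """Hypothetical candidate expression produced by a GP loop."""
    expression: str    # printable form of the expression
    length: int        # number of nodes in the expression tree
    n_variables: int   # number of distinct input variables used
    fitness: float     # higher is better (e.g. 1 / (1 + error))


def descriptor(ind: Individual, length_bin: int = 5) -> Cell:
    """Map an individual to a grid cell (assumed descriptors, see lead-in)."""
    return (ind.length // length_bin, ind.n_variables)


def project(grid: Dict[Cell, Individual], ind: Individual) -> bool:
    """Insert `ind` into its cell if the cell is empty or `ind` is fitter.

    Returns True when the grid was updated.
    """
    cell = descriptor(ind)
    incumbent = grid.get(cell)
    if incumbent is None or ind.fitness > incumbent.fitness:
        grid[cell] = ind
        return True
    return False


grid: Dict[Cell, Individual] = {}
candidates = [
    Individual("x0*x1", length=3, n_variables=2, fitness=0.80),
    Individual("x1/x0", length=3, n_variables=2, fitness=0.90),
    Individual("sin(x0) + x1*x2 - x0/x2", length=12, n_variables=3, fitness=0.50),
]
for candidate in candidates:
    project(grid, candidate)
```

In this sketch, the second candidate replaces the first because both fall in the same cell and it has higher fitness, while the third survives in its own cell despite a lower fitness. Keeping one elite per cell, rather than one global best, is what preserves structurally different expressions as parents for later generations.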

@article{bruneton2025_2503.19043,
  title={Enhancing Symbolic Regression with Quality-Diversity and Physics-Inspired Constraints},
  author={J.-P. Bruneton},
  journal={arXiv preprint arXiv:2503.19043},
  year={2025}
}