What makes math problems hard for reinforcement learning: a case study
Ali Shehper
Anibal M. Medina-Mardones
Lucas Fagan
Bartłomiej Lewandowski
Angus Gruen
Yang Qiu
Piotr Kucharski
Zhenghan Wang
Sergei Gukov

Abstract
Using a long-standing conjecture from combinatorial group theory, we explore, from multiple perspectives, the challenges of finding rare instances carrying disproportionately high rewards. Based on lessons learned in the context defined by the Andrews-Curtis conjecture, we propose algorithmic enhancements and a topological hardness measure with implications for a broad class of search problems. As part of our study, we also address several open mathematical questions. Notably, we demonstrate the length reducibility of all but two presentations in the Akbulut-Kirby series (1981), and resolve various potential counterexamples in the Miller-Schupp series (1991), including three infinite subfamilies.
@article{shehper2025_2408.15332,
  title={What makes math problems hard for reinforcement learning: a case study},
  author={Ali Shehper and Anibal M. Medina-Mardones and Lucas Fagan and Bartłomiej Lewandowski and Angus Gruen and Yang Qiu and Piotr Kucharski and Zhenghan Wang and Sergei Gukov},
  journal={arXiv preprint arXiv:2408.15332},
  year={2025}
}