Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions

Main: 9 pages, 5 figures, 7 tables. Bibliography: 7 pages. Appendix: 6 pages.

Abstract

Mathematical reasoning is central to artificial intelligence, with applications in education, code generation, and research-level mathematical discovery. Mathematical competitions feature two problem types: theorem proving, which requires rigorous proofs, and answer construction, which requires creatively generating mathematical objects and formally verifying them. Prior work shows that LLMs can tackle difficult answer-construction tasks but are prone to errors from hallucinations and unverifiable steps, while symbolic methods guarantee rigor but falter in creative answer construction. This raises a key, understudied question: how can answer-construction problems be solved while preserving both LLM creativity and mathematical rigor? To address this question, we introduce the Enumerate-Conjecture-Prove (ECP) framework, a modular neuro-symbolic method that integrates LLM-based enumeration and pattern-driven conjecturing with formal theorem proving in Lean, and ConstructiveBench, a dataset of 3,640 formal answer-construction problems from math competitions. ECP is model-agnostic and shows consistent improvements over pure LLM baselines: on the answer-construction subset of PutnamBench, ECP formally solves 6 out of 337 problems end to end (up from 4 without ECP) using GPT-5 mini and DeepSeek-Prover-V2-7B. On ConstructiveBench, ECP achieves a state-of-the-art end-to-end accuracy of 33.1% (up from 32.5%), demonstrating its potential to advance formal mathematical reasoning by combining LLM conjecturing with formal verification. Our code and dataset are publicly available at GitHub (this https URL) and Hugging Face (this https URL).
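
To make the enumerate-then-conjecture idea concrete, the following is a minimal Python sketch on a toy problem (a closed form for the sum of the first n odd numbers). It is not the authors' implementation: the toy problem, the function names `enumerate_cases` and `conjecture`, and the hard-coded `CANDIDATES` table (standing in for LLM-proposed conjectures) are all assumptions made for illustration. The final prove step, in which a Lean prover would formally verify the selected conjecture, is not shown.

```python
from typing import Callable, Dict, Optional


def enumerate_cases(max_n: int) -> Dict[int, int]:
    """Enumerate step: brute-force the answer for small inputs.

    Toy problem (an assumption for this sketch): the sum of the
    first n odd numbers, for n = 1..max_n.
    """
    return {
        n: sum(2 * k - 1 for k in range(1, n + 1))
        for n in range(1, max_n + 1)
    }


# Conjecture step: in ECP an LLM proposes candidate answers.
# This hard-coded table is a hypothetical stand-in for those proposals.
CANDIDATES: Dict[str, Callable[[int], int]] = {
    "2 * n - 1": lambda n: 2 * n - 1,
    "n * (n + 1) / 2": lambda n: n * (n + 1) // 2,
    "n ^ 2": lambda n: n ** 2,
}


def conjecture(observed: Dict[int, int]) -> Optional[str]:
    """Return the first candidate consistent with every enumerated case."""
    for name, formula in CANDIDATES.items():
        if all(formula(n) == value for n, value in observed.items()):
            return name
    return None  # no candidate fits; a real system would ask for more


if __name__ == "__main__":
    observed = enumerate_cases(10)
    # Agreement on small cases is only evidence, not a proof; the
    # conjecture would still need formal verification (e.g. in Lean).
    print(conjecture(observed))
```

Filtering candidates against enumerated cases only rules out wrong answers cheaply; it does not establish correctness, which is why the framework pairs this stage with formal proving.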
