MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy

7 August 2025

Papers citing "MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy"

Auxiliary-Hyperparameter-Free Sampling: Entropy Equilibrium for Text Generation

30 Nov 2025

BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs

Ivo Petrov

Jasper Dekoninck

Martin Vechev

153

06 Oct 2025

Socratic-Zero: Bootstrapping Reasoning via Data-Free Agent Co-evolution

SyDa OffRL ReLM LRM ELM

291

29 Sep 2025

From Static to Dynamic: Adaptive Monte Carlo Search for Mathematical Process Supervision

166

29 Sep 2025

ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning

225

25 Sep 2025

Discovering New Theorems via LLMs with In-Context Proof Learning in Lean

122

16 Sep 2025

Merge-of-Thought Distillation

339

10 Sep 2025

ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning

179

27 Aug 2025