ResearchTrend.AI
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning

15 April 2025
Haiming Wang
Mert Unsal
Xiaohan Lin
Mantas Baksys
Junqi Liu
Marco Dos Santos
Flood Sung
Marina Vinyes
Zhenzhe Ying
Zekai Zhu
Jianqiao Lu
Hugues de Saxcé
Bolton Bailey
Chendong Song
Chenjun Xiao
Dehao Zhang
Ebony Zhang
Frederick Pu
Han Zhu
Jiawei Liu
Jonas Bayer
Julien Michel
Longhui Yu
Léo Dreyfus-Schmidt
Lewis Tunstall
Luigi Pagani
Moreira Machado
Pauline Bourigault
Ran Wang
Stanislas Polu
Thibaut Barroyer
Wen-Ding Li
Yazhe Niu
Yann Fleureau
Yangyang Hu
Zhouliang Yu
Zihan Wang
Zhilin Yang
Zhengying Liu
Jia-Nan Li
Abstract

We introduce Kimina-Prover Preview, a large language model that pioneers a novel reasoning-driven exploration paradigm for formal theorem proving, as showcased in this preview release. Trained with a large-scale reinforcement learning pipeline from Qwen2.5-72B, Kimina-Prover demonstrates strong performance in Lean 4 proof generation by employing a structured reasoning pattern we term the "formal reasoning pattern". This approach allows the model to emulate human problem-solving strategies in Lean, iteratively generating and refining proof steps. Kimina-Prover sets a new state of the art on the miniF2F benchmark, reaching 80.7% with pass@8192. Beyond improved benchmark performance, our work yields several key insights: (1) Kimina-Prover exhibits high sample efficiency, delivering strong results even with minimal sampling (pass@1) and scaling effectively with computational budget, stemming from its unique reasoning pattern and RL training; (2) we demonstrate clear performance scaling with model size, a trend previously unobserved for neural theorem provers in formal mathematics; (3) the learned reasoning style, distinct from traditional search algorithms, shows potential to bridge the gap between formal verification and informal mathematical intuition. We open-source distilled versions of Kimina-Prover with 1.5B and 7B parameters.
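The abstract reports results as pass@1 and pass@8192 but does not define the metric. As a sketch, assuming the paper follows the standard unbiased pass@k estimator from the code-generation evaluation literature (Chen et al., 2021), the computation looks like this; the function name is illustrative, not from the paper:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one
    of k attempts drawn without replacement from n total samples
    (of which c are correct) solves the problem."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # samples must include at least one correct proof.
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, if 1 of 2 sampled proofs for a problem checks in Lean, `pass_at_k(2, 1, 1)` gives 0.5 and `pass_at_k(2, 1, 2)` gives 1.0; a benchmark score like the 80.7% pass@8192 above would be the average of such per-problem estimates over miniF2F.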

@article{wang2025_2504.11354,
  title={Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning},
  author={Haiming Wang and Mert Unsal and Xiaohan Lin and Mantas Baksys and Junqi Liu and Marco Dos Santos and Flood Sung and Marina Vinyes and Zhenzhe Ying and Zekai Zhu and Jianqiao Lu and Hugues de Saxcé and Bolton Bailey and Chendong Song and Chenjun Xiao and Dehao Zhang and Ebony Zhang and Frederick Pu and Han Zhu and Jiawei Liu and Jonas Bayer and Julien Michel and Longhui Yu and Léo Dreyfus-Schmidt and Lewis Tunstall and Luigi Pagani and Moreira Machado and Pauline Bourigault and Ran Wang and Stanislas Polu and Thibaut Barroyer and Wen-Ding Li and Yazhe Niu and Yann Fleureau and Yangyang Hu and Zhouliang Yu and Zihan Wang and Zhilin Yang and Zhengying Liu and Jia Li},
  journal={arXiv preprint arXiv:2504.11354},
  year={2025}
}