LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Neural Information Processing Systems (NeurIPS), 2023

27 June 2023

Papers citing "LeanDojo: Theorem Proving with Retrieval-Augmented Language Models"

50 / 192 papers shown

Visual Reasoning Tracer: Object-Level Grounded Reasoning Benchmark

343

04 Dec 2025

Improving Autoformalization Using Direct Dependency Retrieval

Shaoqi Wang

Lu Yu

Chunjie Yang

Feng Yan

Qing Cui

Jun Zhou

148

15 Nov 2025

FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels

...

421

04 Nov 2025

RLMEval: Evaluating Research-Level Neural Theorem Proving

299

29 Oct 2025

ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization

349

28 Oct 2025

ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings

442

17 Oct 2025

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

101

13 Oct 2025

DRIFT: Decompose, Retrieve, Illustrate, then Formalize Theorems

520

12 Oct 2025

Trustworthy Retrosynthesis: Eliminating Hallucinations with a Diverse Ensemble of Reaction Scorers

Paweł Włodarczyk-Pruszyński

Mikołaj Sacha

Piotr Kozakowski

Ruard van Workum

Stanislaw Jastrzebski

172

12 Oct 2025

MASA: LLM-Driven Multi-Agent Systems for Autoformalization

117

10 Oct 2025

Lean Finder: Semantic Search for Mathlib That Understands User Intents

165

08 Oct 2025

Aristotle: IMO-level Automated Theorem Proving

...

175

01 Oct 2025

Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving

253

01 Oct 2025

Hilbert: Recursively Building Formal Proofs with Informal Reasoning

253

26 Sep 2025

A benchmark for vericoding: formally verified program synthesis

Sergiu Bursuc

Theodore Ehrenborg

Shaowei Lin

Lacramioara Astefanoaei

...

26 Sep 2025

FormalML: A Benchmark for Evaluating Formal Subgoal Completion in Machine Learning Theory

138

26 Sep 2025

EngiBench: A Benchmark for Evaluating Large Language Models on Engineering Problem Solving

...

103

22 Sep 2025

Natural Language Translation of Formal Proofs through Informalization of Proof Steps and Recursive Summarization along Proof Structure

Seiji Hattori

Takuya Matsuzaki

Makoto Fujiwara

10 Sep 2025

Deploying AI for Signal Processing education: Selected challenges and intriguing opportunities

167

10 Sep 2025

Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers

217

08 Sep 2025

Towards Repository-Level Program Verification with Large Language Models

Si Cheng Zhong

Xujie Si

ALM

31 Aug 2025

A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants

Barış Bayazıt

Yao Li

Xujie Si

26 Aug 2025

Automated Formalization via Conceptual Retrieval-Augmented LLMs

165

09 Aug 2025

Putnam-AXIOM: A Functional and Static Benchmark for Measuring Higher Level Mathematical Reasoning in LLMs

389

05 Aug 2025

Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction

...

174

05 Aug 2025

The SMeL Test: A simple benchmark for media literacy in language models

Gustaf Ahdritz

Anat Kleiman

244

04 Aug 2025

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

...

238

31 Jul 2025

Solving Formal Math Problems by Decomposition and Iterative Reflection

...

150

21 Jul 2025

LeanTree: Accelerating White-Box Proof Search with Factorized States in Lean 4

210

19 Jul 2025

Prover Agent: An Agent-Based Framework for Formal Mathematical Proofs

352

24 Jun 2025

Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving

153

20 Jun 2025

AlphaEvolve: A coding agent for scientific and algorithmic discovery

...

257

220

16 Jun 2025

Domain Specific Benchmarks for Evaluating Multimodal Large Language Models

Khizar Anjum

Muhammad Arbab Arshad

Kadhim Hayawi

Efstathios Polyzos

A. Tariq

...

Nishith Reddy Mannuru

Ravi Varma Kumar Bevara

Taslim Mahbub

Muhammad Zeeshan Akram

Sakib Shahriar

ELM LRM

426

15 Jun 2025

Beyond Gold Standards: Epistemic Ensemble of LLM Judges for Formal Mathematical Reasoning

Lan Zhang

Marco Valentino

André Freitas

303

12 Jun 2025

A Survey on Large Language Models for Mathematical Reasoning

...

279

10 Jun 2025

LeanTutor: A Formally-Verified AI Tutor for Mathematical Proofs

217

10 Jun 2025

Premise Selection for a Lean Hammer

156

09 Jun 2025

Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models

219

09 Jun 2025

MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?

241

06 Jun 2025

LeanExplore: A search engine for Lean 4 declarations

Justin Asher

171

04 Jun 2025

DINGO: Constrained Inference for Diffusion LLMs

199

29 May 2025

Using Reasoning Models to Generate Search Heuristics that Solve Open Instances of Combinatorial Design Problems

Christopher D. Rosin

LRM

227

29 May 2025

RocqStar: Leveraging Similarity-driven Retrieval and Agentic Systems for Rocq generation

Nikita Khramov

Gleb Solovev

Anton Podkopaev

229

28 May 2025

Generalizable Process Reward Models via Formally Verified Training Data

Ryo Kamoi

Yusen Zhang

Nan Zhang

Sarkar Snigdha Sarathi Das

Rui Zhang

OffRL LRM

298

21 May 2025

HybridProver: Augmenting Theorem Proving with LLM-Driven Proof Synthesis and Refinement

164

21 May 2025

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Christian Walder

Deep Karkhanis

OffRL

391

21 May 2025

FOL-Traces: Verified First-Order Logic Reasoning Traces at Scale

204

20 May 2025

CLEVER: A Curated Benchmark for Formally Verified Code Generation

508

20 May 2025

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving on Inequalities

377

19 May 2025

LLM-based Automated Theorem Proving Hinges on Scalable Synthetic Data Generation

307

17 May 2025