Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2407.11214
Cited By
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition
15 July 2024
George Tsoukalas
Jasper Lee
John Jennings
Jimmy Xin
Michelle Ding
Michael Jennings
Amitayush Thakur
Swarat Chaudhuri
LRM
AIMat
Re-assign community
ArXiv
PDF
HTML
Papers citing
"PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition"
16 / 16 papers shown
Title
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving
Qi Liu
Xinhao Zheng
Renqiu Xia
Xingzhi Qi
Qinxiang Cao
Junchi Yan
AIMat
45
0
0
07 May 2025
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics
J. Liu
Xiaohan Lin
Jonas Bayer
Yael Dillies
Weijie Jiang
...
Zhengfeng Yang
J. Zhang
Lihong Zhi
J. Li
Zhengying Liu
48
0
0
06 May 2025
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
Zhouliang Yu
Ruotian Peng
Keyi Ding
Y. K. Li
Zhongyuan Peng
...
Huajian Xin
W. R. Huang
Yandong Wen
Ge Zhang
Weiyang Liu
LRM
35
0
0
05 May 2025
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning
Haiming Wang
Mert Unsal
Xiaohan Lin
Mantas Baksys
J. Liu
...
Zhouliang Yu
Z. Wang
Zhilin Yang
Zhengying Liu
Jia-Nan Li
AIMat
ReLM
AI4TS
LRM
49
4
0
15 Apr 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
LRM
ELM
62
8
0
27 Mar 2025
Chain-of-Thought Reasoning In The Wild Is Not Always Faithful
Iván Arcuschin
Jett Janiak
Robert Krzyzanowski
Senthooran Rajamanoharan
Neel Nanda
Arthur Conmy
LRM
ReLM
62
6
0
11 Mar 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
110
1
0
21 Feb 2025
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Daniel J.H. Chung
Zhiqi Gao
Yurii Kvasiuk
Tianyi Li
Moritz Münchmeyer
Maja Rudolph
Frederic Sala
Sai Chaitanya Tadepalli
AIMat
44
3
0
19 Feb 2025
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Yinghui Li
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Xiaoyu Tan
C. Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
41
3
0
12 Feb 2025
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
Xiaoyang Liu
Kangjie Bao
Jiashuo Zhang
Yunqi Liu
Yu Chen
Yuntian Liu
Yang Jiao
Tao Luo
AIMat
50
0
0
08 Feb 2025
Formal Mathematical Reasoning: A New Frontier in AI
Kaiyu Yang
Gabriel Poesia
Jingxuan He
Wenda Li
Kristin Lauter
Swarat Chaudhuri
Dawn Song
LRM
AI4CE
82
20
0
20 Dec 2024
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong
Arvind V. Mahankali
Tengyu Ma
ReLM
LRM
28
5
0
04 Nov 2024
LeanAgent: Lifelong Learning for Formal Theorem Proving
Adarsh Kumarappan
Mo Tiwari
Peiyang Song
Robert Joseph George
Chaowei Xiao
Anima Anandkumar
CLL
LLMAG
LRM
59
8
0
08 Oct 2024
A Survey on Deep Learning for Theorem Proving
Zhaoyu Li
Jialiang Sun
Logan Murphy
Qidong Su
Zenan Li
Xian Zhang
Kaiyu Yang
Xujie Si
LRM
42
21
0
15 Apr 2024
Baldur: Whole-Proof Generation and Repair with Large Language Models
E. First
M. Rabe
Talia Ringer
Yuriy Brun
53
90
0
08 Mar 2023
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
58
154
0
21 Oct 2022
1