MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics


31 August 2021
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
    AIMat

Papers citing "MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics"

50 / 105 papers shown
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving
Qi Liu
Xinhao Zheng
Renqiu Xia
Xingzhi Qi
Qinxiang Cao
Junchi Yan
AIMat
45
0
0
07 May 2025
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics
J. Liu
Xiaohan Lin
Jonas Bayer
Yael Dillies
Weijie Jiang
...
Zhengfeng Yang
J. Zhang
Lihong Zhi
J. Li
Zhengying Liu
67
0
0
06 May 2025
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
Zhouliang Yu
Ruotian Peng
Keyi Ding
Y. K. Li
Zhongyuan Peng
...
Huajian Xin
W. R. Huang
Yandong Wen
Ge Zhang
Weiyang Liu
LRM
75
0
0
05 May 2025
Hierarchical Attention Generates Better Proofs
Jianlong Chen
Chao Li
Yang Yuan
Andrew Chi-Chih Yao
AIMat
LRM
26
0
0
27 Apr 2025
APE-Bench I: Towards File-level Automated Proof Engineering of Formal Math Libraries
Huajian Xin
Luming Li
Xiaoran Jin
Jacques Fleuriot
Wenda Li
AIMat
48
0
0
27 Apr 2025
Neural Theorem Proving: Generating and Structuring Proofs for Formal Verification
Balaji Rao
William Eiers
Carlo Lipizzi
32
0
0
23 Apr 2025
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning
Haiming Wang
Mert Unsal
Xiaohan Lin
Mantas Baksys
J. Liu
...
Zhouliang Yu
Z. Wang
Zhilin Yang
Zhengying Liu
Jia-Nan Li
AIMat
ReLM
AI4TS
LRM
49
4
0
15 Apr 2025
Reasoning Models Can Be Effective Without Thinking
Wenjie Ma
Jingxuan He
Charlie Snell
Tyler Griggs
Sewon Min
Matei A. Zaharia
ReLM
LRM
50
4
1
14 Apr 2025
Leanabell-Prover: Posttraining Scaling in Formal Reasoning
Jingyuan Zhang
Qi Wang
Xingguang Ji
Y. Liu
Yang Yue
Fuzheng Zhang
Di Zhang
Guorui Zhou
Kun Gai
LRM
34
2
0
08 Apr 2025
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics
Hamed Mahdavi
Alireza Hashemi
Majid Daliri
Pegah Mohammadipour
Alireza Farhadi
Samira Malek
Yekta Yazdanifard
Amir Khasahmadi
V. Honavar
ELM
LRM
52
1
0
01 Apr 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
LRM
ELM
62
8
0
27 Mar 2025
Rosetta-PL: Propositional Logic as a Benchmark for Large Language Model Reasoning
Shaun Baek
Shaun Esua-Mensah
Cyrus Tsui
Sejan Vigneswaralingam
Abdullah Alali
Michael Lu
Vasu Sharma
Sean O'Brien
Kevin Zhu
LRM
51
0
0
25 Mar 2025
A Survey on Mathematical Reasoning and Optimization with Large Language Models
Ali Forootani
OffRL
LRM
AI4CE
40
0
0
22 Mar 2025
FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4
Jiarui Yao
Ruida Wang
Tong Zhang
LRM
60
0
0
05 Mar 2025
MA-LoT: Multi-Agent Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving
Ruida Wang
Rui Pan
Yuxin Li
Jipeng Zhang
Yizhen Jia
Shizhe Diao
Renjie Pi
Junjie Hu
Tong Zhang
LRM
LLMAG
81
5
0
05 Mar 2025
From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems
Zekun Zhou
Xiaocheng Feng
L. Huang
Xiachong Feng
Ziyun Song
...
Baoxin Wang
Dayong Wu
Guoping Hu
Ting Liu
Bing Qin
AI4TS
66
0
0
03 Mar 2025
CuDIP: Enhancing Theorem Proving in LLMs via Curriculum Learning-based Direct Preference Optimization
Shuming Shi
Ruobing Zuo
Gaolei He
Jianlin Wang
Chenyang Xu
Zhengfeng Yang
60
0
0
25 Feb 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
149
1
0
21 Feb 2025
Activation Steering in Neural Theorem Provers
Shashank Kirtania
LLMSV
111
0
0
21 Feb 2025
Simplifying Formal Proof-Generating Models with ChatGPT and Basic Searching Techniques
Sangjun Han
Taeil Hur
Youngmi Hur
Kathy Sangkyung Lee
Myungyoon Lee
Hyojae Lim
93
0
0
20 Feb 2025
Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics
Daniel J.H. Chung
Zhiqi Gao
Yurii Kvasiuk
Tianyi Li
Moritz Münchmeyer
Maja Rudolph
Frederic Sala
Sai Chaitanya Tadepalli
AIMat
44
3
0
19 Feb 2025
Lean-ing on Quality: How High-Quality Data Beats Diverse Multilingual Data in AutoFormalization
Willy Chan
Michael Souliman
Jakob Nordhagen
Brando Miranda
Elyas Obbad
Kai Fronsdal
Sanmi Koyejo
34
0
0
18 Feb 2025
Formalizing Complex Mathematical Statements with LLMs: A Study on Mathematical Definitions
Lan Zhang
Marco Valentino
André Freitas
44
0
0
17 Feb 2025
Generating Millions Of Lean Theorems With Proofs By Exploring State Transition Graphs
David Yin
Jing Gao
42
0
0
16 Feb 2025
A cross-regional review of AI safety regulations in the commercial aviation
Penny A. Barr
Sohel M. Imroz
39
0
0
12 Feb 2025
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Yinghui Li
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Xiaoyu Tan
C. Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
41
3
0
12 Feb 2025
Examining False Positives under Inference Scaling for Mathematical Reasoning
Yu Guang Wang
Nan Yang
Liang Wang
Furu Wei
LRM
59
3
0
10 Feb 2025
ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data
Xiaoyang Liu
Kangjie Bao
Jiashuo Zhang
Yunqi Liu
Yu Chen
Yuntian Liu
Yang Jiao
Tao Luo
AIMat
50
0
0
08 Feb 2025
Advanced Weakly-Supervised Formula Exploration for Neuro-Symbolic Mathematical Reasoning
Yuxuan Wu
Hideki Nakayama
NAI
53
0
0
02 Feb 2025
Understand, Solve and Translate: Bridging the Multilingual Mathematical Reasoning Gap
Hyunwoo Ko
Guijin Son
Dasol Choi
RALM
LRM
78
7
0
05 Jan 2025
Mathematical Language Models: A Survey
W. Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
79
12
0
03 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
76
211
0
03 Jan 2025
Formal Mathematical Reasoning: A New Frontier in AI
Kaiyu Yang
Gabriel Poesia
Jingxuan He
Wenda Li
Kristin Lauter
Swarat Chaudhuri
Dawn Song
LRM
AI4CE
82
20
0
20 Dec 2024
Can Language Models Rival Mathematics Students? Evaluating Mathematical Reasoning through Textual Manipulation and Human Experiments
Andrii Nikolaiev
Yiannos Stathopoulos
Simone Teufel
LRM
69
0
0
16 Dec 2024
Proposing and solving olympiad geometry with guided tree search
Chi Zhang
Jiajun Song
Siyu Li
Yitao Liang
Yuxi Ma
Wei Wang
Yixin Zhu
Song-Chun Zhu
AIMat
LRM
74
3
0
14 Dec 2024
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong
Arvind V. Mahankali
Tengyu Ma
ReLM
LRM
28
5
0
04 Nov 2024
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency
Zenan Li
Yifan Wu
Zhaoyu Li
Xinming Wei
Xian Zhang
Fan Yang
Xiaoxing Ma
32
3
0
28 Oct 2024
Library Learning Doesn't: The Curious Case of the Single-Use "Library"
Ian Berlot-Attwell
Frank Rudzicz
Xujie Si
37
1
0
26 Oct 2024
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation
Shaonan Wu
Shuai Lu
Y. Gong
Nan Duan
Ping Wei
AIMat
40
0
0
21 Oct 2024
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
Bofei Gao
Feifan Song
Z. Yang
Zefan Cai
Yibo Miao
...
Lei Sha
Yichang Zhang
Xuancheng Ren
Tianyu Liu
Baobao Chang
ELM
LRM
21
36
0
10 Oct 2024
Herald: A Natural Language Annotated Lean 4 Dataset
Guoxiong Gao
Yutong Wang
Jiedong Jiang
Qi Gao
Zihan Qin
Tianyi Xu
Bin Dong
54
3
0
09 Oct 2024
Consistent Autoformalization for Constructing Mathematical Libraries
Lan Zhang
Xin Quan
André Freitas
AI4CE
30
2
0
05 Oct 2024
Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization
Mucong Ding
Chenghao Deng
Jocelyn Choo
Zichu Wu
Aakriti Agrawal
...
Tianyi Zhou
Tom Goldstein
John Langford
Anima Anandkumar
Furong Huang
51
5
0
27 Sep 2024
Proof Automation with Large Language Models
Minghai Lu
Benjamin Delaware
Tianyi Zhang
LRM
31
3
0
22 Sep 2024
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving
Xueliang Zhao
Lin Zheng
Haige Bo
Changran Hu
Urmish Thakker
Lingpeng Kong
LRM
40
1
0
20 Aug 2024
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Huajian Xin
Z. Z. Ren
Junxiao Song
Zhihong Shao
Wanjia Zhao
...
Dejian Yang
Zhibin Gou
Z. F. Wu
Fuli Luo
Chong Ruan
AIMat
LRM
39
45
0
15 Aug 2024
miniCTX: Neural Theorem Proving with (Long-)Contexts
Jiewen Hu
Thomas Zhu
Sean Welleck
AIMat
58
6
0
05 Aug 2024
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover
Zijian Wu
Jiayu Wang
Dahua Lin
Kai-xiang Chen
27
12
0
24 Jul 2024
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition
George Tsoukalas
Jasper Lee
John Jennings
Jimmy Xin
Michelle Ding
Michael Jennings
Amitayush Thakur
Swarat Chaudhuri
LRM
AIMat
29
18
0
15 Jul 2024
Lean-STaR: Learning to Interleave Thinking and Proving
Haohan Lin
Zhiqing Sun
Yiming Yang
Sean Welleck
ReLM
LRM
65
23
0
14 Jul 2024