MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

International Conference on Learning Representations (ICLR), 2021
31 August 2021
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
    AIMat

Papers citing "MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics"

50 / 168 papers shown
Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training
Xinyuan Zhou
Yi Lei
Xiaoyu Zhou
Jingyi Sun
Yu Zhu
Zhongyi Ye
Weitai Zhang
Quan Liu
Si Wei
Cong Liu
ALM, LRM
227
0
0
17 Nov 2025
Improving Autoformalization Using Direct Dependency Retrieval
Shaoqi Wang
Lu Yu
Chunjie Yang
65
0
0
15 Nov 2025
Towards Autoformalization of LLM-generated Outputs for Requirement Verification
Mihir Gupte
Ramesh S
36
0
0
14 Nov 2025
miniF2F-Lean Revisited: Reviewing Limitations and Charting a Path Forward
Azim Ospanov
Farzan Farnia
Roozbeh Yousefzadeh
65
0
0
05 Nov 2025
The ORCA Benchmark: Evaluating Real-World Calculation Accuracy in Large Language Models
Claudia Herambourg
Dawid Siuda
Julia Kopczyńska
Joao R. L. Santos
Wojciech Sas
Joanna Śmietańska-Nowak
ELM, ALM, LRM
306
0
0
04 Nov 2025
FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels
Jiedong Jiang
Wanyi He
Yuefeng Wang
Guoxiong Gao
Yongle Hu
...
Nailing Guan
Peihao Wu
Chunbo Dai
Liang Xiao
Bin Dong
AIMat, ELM, LRM
298
0
0
04 Nov 2025
ReForm: Reflective Autoformalization with Prospective Bounded Sequence Optimization
Guoxin Chen
Jing Wu
Xinjie Chen
Wayne Xin Zhao
Ruihua Song
Chengxi Li
Kai Fan
Dayiheng Liu
Minpeng Liao
AIMat, OffRL
271
0
0
28 Oct 2025
ProofBridge: Auto-Formalization of Natural Language Proofs in Lean via Joint Embeddings
Prithwish Jana
Kaan Kale
Ahmet Ege Tanriverdi
Cruise Song
S. Vishwanath
Vijay Ganesh
AIMat
308
0
0
17 Oct 2025
Max It or Miss It: Benchmarking LLM On Solving Extremal Problems
Binxin Gao
Jingjun Han
ELM, LRM
161
0
0
14 Oct 2025
Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics
Marco Del Tredici
Jacob McCarran
Benjamin Breen
Javier Aspuru Mijares
Weichen Winston Yin
Jacob M. Taylor
Frank Koppens
Dirk Englund
LRM
156
0
0
14 Oct 2025
TopoAlign: A Framework for Aligning Code to Math via Topological Decomposition
Yupei Li
Philipp Borchert
Gerasimos Lampouras
85
0
0
13 Oct 2025
GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving
Ruida Wang
Jiarui Yao
Rui Pan
Shizhe Diao
Tong Zhang
59
0
0
13 Oct 2025
MASA: LLM-Driven Multi-Agent Systems for Autoformalization
Lan Zhang
Marco Valentino
André Freitas
LLMAG, AI4CE
72
0
0
10 Oct 2025
RefGrader: Automated Grading of Mathematical Competition Proofs using Agentic Workflows
Hamed Mahdavi
Pouria Mahdavinia
Samira Malek
Pegah Mohammadipour
Alireza Hashemi
Majid Daliri
Alireza Farhadi
Amir Khasahmadi
Niloofar Mireshghallah
V. Honavar
100
1
0
10 Oct 2025
RAPID: An Efficient Reinforcement Learning Algorithm for Small Language Models
Lianghuan Huang
Sagnik Anupam
Insup Lee
Shuo Li
Osbert Bastani
81
1
0
03 Oct 2025
PRISM-Physics: Causal DAG-Based Process Evaluation for Physics Reasoning
Wanjia Zhao
Qinwei Ma
Jingzhe Shi
Shirley Wu
Jiaqi Han
Yijia Xiao
S. Chen
Xiao Luo
Ludwig Schmidt
James Zou
LRM
72
0
0
03 Oct 2025
Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving
Shunfeng Zheng
Yudi Zhang
Meng Fang
Zihan Zhang
Zhitan Wu
Mykola Pechenizkiy
Ling-Hao Chen
ReLM, RALM, LRM
188
0
0
01 Oct 2025
Aristotle: IMO-level Automated Theorem Proving
Tudor Achim
Alex Best
Kevin Der
Mathïs Fédérico
Sergei Gukov
...
Matyas Tamas
Vlad Tenev
Jonathan Thomm
Harold Williams
Lawrence Wu
LRM
130
3
0
01 Oct 2025
Atomic Thinking of LLMs: Decoupling and Exploring Mathematical Reasoning Abilities
Jiayi Kuang
Haojing Huang
Yinghui Li
Xinnian Liang
Zhikun Xu
...
Xiaoyu Tan
Chao Qu
Meishan Zhang
Ying Shen
Philip S. Yu
LRM
125
4
0
30 Sep 2025
ASSESS: A Semantic and Structural Evaluation Framework for Statement Similarity
Xiaoyang Liu
Tao Zhu
Zineng Dong
Yuntian Liu
Qingfeng Guo
Zhaoxuan Liu
Yu Chen
Tao Luo
60
0
0
26 Sep 2025
Hilbert: Recursively Building Formal Proofs with Informal Reasoning
Sumanth Varambally
Thomas Voice
Yanchao Sun
Zhifeng Chen
Rose Yu
Ke Ye
AIMat, ReLM, LRM
177
8
0
26 Sep 2025
FormalML: A Benchmark for Evaluating Formal Subgoal Completion in Machine Learning Theory
Xiao-Wen Yang
Zihao Zhang
Jianuo Cao
Zhi Zhou
Zenan Li
Lan-Zhe Guo
Yuan Yao
Taolue Chen
Yu-Feng Li
Xiaoxing Ma
ALM, LRM
96
1
0
26 Sep 2025
EngiBench: A Benchmark for Evaluating Large Language Models on Engineering Problem Solving
Xiyuan Zhou
Xinlei Wang
Yirui He
Yang Wu
Ruixi Zou
...
Wenxuan Liu
Huan Zhao
Yan Xu
Jinjin Gu
Junhua Zhao
ELM, LRM
84
1
0
22 Sep 2025
Large Language Models as End-to-end Combinatorial Optimization Solvers
Xia Jiang
Yaoxin Wu
Minshuo Li
Zhiguang Cao
Yingqian Zhang
144
1
0
21 Sep 2025
EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving
Mukai Li
Linfeng Song
Zhenwen Liang
Jiahao Xu
Shansan Gong
Qi Liu
Haitao Mi
Dong Yu
OffRL, LRM
100
0
0
16 Sep 2025
REAMS: Reasoning Enhanced Algorithm for Maths Solving
Eishkaran Singh
Tanav Singh Bajaj
Siddharth Nayak
AIMat
173
0
0
16 Sep 2025
Natural Language Translation of Formal Proofs through Informalization of Proof Steps and Recursive Summarization along Proof Structure
Seiji Hattori
Takuya Matsuzaki
Makoto Fujiwara
40
0
0
10 Sep 2025
A Fragile Number Sense: Probing the Elemental Limits of Numerical Reasoning in LLMs
Roussel Rahman
Aashwin Ananda Mishra
LRM
71
1
0
08 Sep 2025
FormaRL: Enhancing Autoformalization with no Labeled Data
Yanxing Huang
Xinling Jin
Sijie Liang
Peng Li
Yang Liu
OffRL, AIMat, AI4CE
163
3
0
26 Aug 2025
A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants
Barış Bayazıt
Yao Li
Xujie Si
52
1
0
26 Aug 2025
Lean Meets Theoretical Computer Science: Scalable Synthesis of Theorem Proving Challenges in Formal-Informal Pairs
Terry Jingchen Zhang
Wenyuan Jiang
Rongchuan Liu
Yisong Wang
J. Yang
Ning Wang
Nicole Ni
Yinya Huang
Mrinmaya Sachan
AIMat
161
0
0
21 Aug 2025
Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions
Xuyang Guo
Zekai Huang
Zhao Song
Jiahao Zhang
LRM
120
3
0
16 Aug 2025
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
Yuren Hao
Xiang Wan
Chengxiang Zhai
LRM
160
2
0
12 Aug 2025
Automated Formalization via Conceptual Retrieval-Augmented LLMs
Wangyue Lu
Lun Du
Sirui Li
Ke Weng
Haozhe Sun
Hengyu Liu
Minghe Yu
Tiancheng Zhang
Ge Yu
124
3
0
09 Aug 2025
StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs through Knowledge-Reasoning Fusion
Yutong Wu
Di Huang
Ruosi Wan
Yue Peng
Shijie Shang
...
Lei Qi
Rui Zhang
Zidong Du
Jie Yan
Xing Hu
ReLM, OffRL, LRM
123
6
0
06 Aug 2025
Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction
Yong Lin
Shange Tang
Bohan Lyu
Ziran Yang
Jui-Hui Chung
...
Hongzhou Lin
Yejin Choi
Danqi Chen
Sanjeev Arora
Chi Jin
MoE, LRM
130
36
0
05 Aug 2025
Proof2Hybrid: Automatic Mathematical Benchmark Synthesis for Proof-Centric Problems
Yebo Peng
Zixiang Liu
Yaoming Li
Zhizhuo Yang
Xinye Xu
Bowen Ye
Weijun Yuan
Zihan Wang
Tong Yang
AIMat
188
0
0
04 Aug 2025
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
L. Chen
J. Gu
Daigang Xu
Wenhao Huang
Z. L. Jiang
...
Ge Zhang
Tianyun Zhao
Jianqiu Zhao
Yichi Zhou
Thomas Hanwen Zhu
AIMat, LRM
190
30
0
31 Jul 2025
Solving Formal Math Problems by Decomposition and Iterative Reflection
Yichi Zhou
Jianqiu Zhao
Yongxin Zhang
Bohan Wang
Siran Wang
...
Rong Ye
Phan Nhat Hoang
Huishuai Zhang
Peng Sun
Hang Li
115
14
0
21 Jul 2025
LeanTree: Accelerating White-Box Proof Search with Factorized States in Lean 4
Matěj Kripner
Michal Šustr
Milan Straka
LRM
152
1
0
19 Jul 2025
ProofCompass: Enhancing Specialized Provers with LLM Guidance
Nicolas Wischermann
Claudio Mayrink Verdun
Gabriel Poesia
Francesco Noseda
LRM
120
3
0
18 Jul 2025
Generalized Tree Edit Distance (GTED): A Faithful Evaluation Metric for Statement Autoformalization
Yuntian Liu
Tao Zhu
Xiaoyang Liu
Yu Chen
Zhaoxuan Liu
Qingfeng Guo
Jiashuo Zhang
Kangjie Bao
Tao Luo
145
0
0
10 Jul 2025
Prover Agent: An Agent-Based Framework for Formal Mathematical Proofs
Kaito Baba
Chaoran Liu
Shuhei Kurita
Akiyoshi Sannai
LLMAG
288
9
0
24 Jun 2025
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving
Chuxue Cao
Mengze Li
Juntao Dai
Jinluan Yang
Zijian Zhao
Shengyu Zhang
Weijie Shi
Chengzhong Liu
Sirui Han
Wenhan Luo
LRM
128
3
0
20 Jun 2025
Beyond Gold Standards: Epistemic Ensemble of LLM Judges for Formal Mathematical Reasoning
Lan Zhang
Marco Valentino
André Freitas
228
0
0
12 Jun 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
208
13
0
10 Jun 2025
Mathesis: Towards Formal Theorem Proving from Natural Languages
Yu Xuejun
Jianyuan Zhong
Zijin Feng
Pengyi Zhai
Roozbeh Yousefzadeh
...
Dongcai Lu
Jiacheng Sun
Q. Xu
Shen Xin
Zhenguo Li
AIMat, OffRL, LRM
189
8
0
08 Jun 2025
MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?
Zhitao He
Zongwei Lyu
Dazhong Chen
Dadi Guo
Yi R. Fung
LRM
184
5
0
06 Jun 2025
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chengwu Liu
Ye Yuan
Yichun Yin
Yan Xu
Xin Xu
Zaoyu Chen
Yasheng Wang
Lifeng Shang
Qun Liu
Ming Zhang
LRM
306
7
0
05 Jun 2025
ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research
Zhiyuan Wang
Bokui Chen
Yinya Huang
Qingxing Cao
Ming He
Jianping Fan
Xiaodan Liang
LRM
168
4
0
02 Jun 2025