ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2109.00110
  4. Cited By
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics
v1v2 (latest)

MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

International Conference on Learning Representations (ICLR), 2021
31 August 2021
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
    AIMat
ArXiv (abs)PDFHTML

Papers citing "MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics"

50 / 170 papers shown
Can Language Models Rival Mathematics Students? Evaluating Mathematical
  Reasoning through Textual Manipulation and Human Experiments
Can Language Models Rival Mathematics Students? Evaluating Mathematical Reasoning through Textual Manipulation and Human Experiments
Andrii Nikolaiev
Yiannos Stathopoulos
Simone Teufel
LRM
188
2
0
16 Dec 2024
Proposing and solving olympiad geometry with guided tree search
Proposing and solving olympiad geometry with guided tree search
Chi Zhang
Jiajun Song
Siyu Li
Yitao Liang
Yuxi Ma
Wei Wang
Yixin Zhu
Song-Chun Zhu
AIMatLRM
264
10
0
14 Dec 2024
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs
  Hierarchically
Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically
Kefan Dong
Arvind V. Mahankali
Tengyu Ma
ReLMLRM
294
11
0
04 Nov 2024
Autoformalize Mathematical Statements by Symbolic Equivalence and
  Semantic Consistency
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic ConsistencyNeural Information Processing Systems (NeurIPS), 2024
Zenan Li
Yifan Wu
Zhaoyu Li
Xinming Wei
Xian Zhang
Fan Yang
Xiaoxing Ma
254
0
0
28 Oct 2024
Library Learning Doesn't: The Curious Case of the Single-Use "Library"
Library Learning Doesn't: The Curious Case of the Single-Use "Library"
Ian Berlot-Attwell
Frank Rudzicz
Xujie Si
178
2
0
26 Oct 2024
Alchemy: Amplifying Theorem-Proving Capability through Symbolic Mutation
Alchemy: Amplifying Theorem-Proving Capability through Symbolic MutationInternational Conference on Learning Representations (ICLR), 2024
Shaonan Wu
Shuai Lu
Yeyun Gong
Nan Duan
Ping Wei
AIMat
295
1
0
21 Oct 2024
3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes
3D-Prover: Diversity Driven Theorem Proving With Determinantal Point Processes
Sean Lamont
Christian J. Walder
Amir Dezfouli
Paul Montague
Michael Norrish
308
0
0
14 Oct 2024
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large
  Language Models
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language ModelsInternational Conference on Learning Representations (ICLR), 2024
Bofei Gao
Feifan Song
Zhiyong Yang
Zefan Cai
Yibo Miao
...
Lei Sha
Yichang Zhang
Xuancheng Ren
Tianyu Liu
Baobao Chang
ELMLRM
280
135
0
10 Oct 2024
Herald: A Natural Language Annotated Lean 4 Dataset
Herald: A Natural Language Annotated Lean 4 DatasetInternational Conference on Learning Representations (ICLR), 2024
Guoxiong Gao
Yutong Wang
Jiedong Jiang
Qi Gao
Zihan Qin
Tianyi Xu
Bin Dong
404
25
0
09 Oct 2024
Consistent Autoformalization for Constructing Mathematical Libraries
Consistent Autoformalization for Constructing Mathematical LibrariesConference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Lan Zhang
Xin Quan
André Freitas
AI4CE
206
12
0
05 Oct 2024
Proof Automation with Large Language Models
Proof Automation with Large Language Models
Minghai Lu
Benjamin Delaware
Tianyi Zhang
LRM
170
23
0
22 Sep 2024
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving
SubgoalXL: Subgoal-based Expert Learning for Theorem Proving
Xueliang Zhao
Lin Zheng
Haige Bo
Changran Hu
Urmish Thakker
Lingpeng Kong
LRM
269
11
0
20 Aug 2024
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for
  Reinforcement Learning and Monte-Carlo Tree Search
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Huajian Xin
Zhaochun Ren
Junxiao Song
Zhihong Shao
Wanjia Zhao
...
Dejian Yang
Zhibin Gou
Z. F. Wu
Fuli Luo
Chong Ruan
AIMatLRM
288
132
0
15 Aug 2024
miniCTX: Neural Theorem Proving with (Long-)Contexts
miniCTX: Neural Theorem Proving with (Long-)ContextsInternational Conference on Learning Representations (ICLR), 2024
Jiewen Hu
Thomas Zhu
Sean Welleck
AIMat
454
23
0
05 Aug 2024
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN
  prover
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover
Zijian Wu
Jiayu Wang
Dahua Lin
Kai-xiang Chen
295
23
0
24 Jul 2024
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam
  Mathematical Competition
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition
George Tsoukalas
Jasper Lee
John Jennings
Jimmy Xin
Michelle Ding
Michael Jennings
Amitayush Thakur
Swarat Chaudhuri
LRMAIMat
254
81
0
15 Jul 2024
Lean-STaR: Learning to Interleave Thinking and Proving
Lean-STaR: Learning to Interleave Thinking and Proving
Haohan Lin
Zhiqing Sun
Yiming Yang
Sean Welleck
ReLMLRM
683
50
0
14 Jul 2024
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical
  Reasoning with Checklist
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Zihao Zhou
Shudong Liu
Maizhen Ning
Wei Liu
Jindong Wang
Yang Li
Xiaowei Huang
Qiufeng Wang
Kaizhu Huang
ELMLRM
254
45
0
11 Jul 2024
Towards Automated Functional Equation Proving: A Benchmark Dataset and A
  Domain-Specific In-Context Agent
Towards Automated Functional Equation Proving: A Benchmark Dataset and A Domain-Specific In-Context Agent
Mahdi Buali
Robert Hoehndorf
242
0
0
05 Jul 2024
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang
Jipeng Zhang
Yizhen Jia
Boyao Wang
Shizhe Diao
Renjie Pi
Tong Zhang
LRM
284
46
0
03 Jul 2024
Learning Formal Mathematics From Intrinsic Motivation
Learning Formal Mathematics From Intrinsic Motivation
Gabriel Poesia
David Broman
Nick Haber
Noah D. Goodman
LRM
309
29
0
30 Jun 2024
FVEL: Interactive Formal Verification Environment with Large Language
  Models via Theorem Proving
FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving
Xiaohan Lin
Qingxing Cao
Yinya Huang
Haiming Wang
Jianqiao Lu
Zhengying Liu
Linqi Song
Xiaodan Liang
LRM
164
20
0
20 Jun 2024
Proving Olympiad Algebraic Inequalities without Human Demonstrations
Proving Olympiad Algebraic Inequalities without Human Demonstrations
Chenrui Wei
Mengzhou Sun
Wei Wang
LRM
238
14
0
20 Jun 2024
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical
  Problem-Solving
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong
Xiwen Zhang
Rui Wang
R. Wu
Junxian He
AIMatLRM
240
81
0
18 Jun 2024
miniCodeProps: a Minimal Benchmark for Proving Code Properties
miniCodeProps: a Minimal Benchmark for Proving Code Properties
Evan Lohn
Sean Welleck
181
14
0
16 Jun 2024
Reliable Evaluation and Benchmarks for Statement Autoformalization
Reliable Evaluation and Benchmarks for Statement Autoformalization
Auguste Poiroux
Gail Weiss
Viktor Kuncak
Antoine Bosselut
412
10
0
11 Jun 2024
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
Huaiyuan Ying
Zijian Wu
Yihan Geng
Zheng Yuan
Dahua Lin
Kai Chen
501
84
0
06 Jun 2024
Process-Driven Autoformalization in Lean 4
Process-Driven Autoformalization in Lean 4
Jianqiao Lu
Zhengying Liu
Yingjia Wan
Yinya Huang
Haiming Wang
Zhicheng YANG
Jing Tang
Zhijiang Guo
AI4CE
394
34
0
04 Jun 2024
Proving Theorems Recursively
Proving Theorems RecursivelyNeural Information Processing Systems (NeurIPS), 2024
Haiming Wang
Huajian Xin
Zhengying Liu
Wenda Li
Yinya Huang
...
Zhicheng YANG
Jing Tang
Jian Yin
Zhenguo Li
Xiaodan Liang
LRM
218
24
0
23 May 2024
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale
  Synthetic Data
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Huajian Xin
Daya Guo
Zhihong Shao
Zhaochun Ren
Qihao Zhu
Bo Liu
Chong Ruan
Wenda Li
Xiaodan Liang
SyDa
286
151
0
23 May 2024
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean
Peiyang Song
Kaiyu Yang
A. Anandkumar
372
10
0
18 Apr 2024
A Survey on Deep Learning for Theorem Proving
A Survey on Deep Learning for Theorem Proving
Zhaoyu Li
Jialiang Sun
Logan Murphy
Qidong Su
Zenan Li
Xian Zhang
Kaiyu Yang
Xujie Si
LRM
286
49
0
15 Apr 2024
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with
  Autoformalization
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Peng Zhou
Charles Staats
Wenda Li
Christian Szegedy
Kilian Q. Weinberger
Yuhuai Wu
LRM
190
61
0
26 Mar 2024
BAIT: Benchmarking (Embedding) Architectures for Interactive
  Theorem-Proving
BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving
Sean Lamont
Michael Norrish
Amir Dezfouli
Christian J. Walder
Paul Montague
279
3
0
06 Mar 2024
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of
  LLMs as Mathematical Problem Solvers
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
Qintong Li
Leyang Cui
Xueliang Zhao
Lingpeng Kong
Wei Bi
LRM
334
107
0
29 Feb 2024
Measuring Vision-Language STEM Skills of Neural Models
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
426
13
0
27 Feb 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with
  Olympiad-Level Bilingual Multimodal Scientific Problems
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELMAIMat
401
677
0
21 Feb 2024
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
Yinya Huang
Xiaohan Lin
Zhengying Liu
Qingxing Cao
Huajian Xin
Haiming Wang
Zhenguo Li
Linqi Song
Xiaodan Liang
ALM
374
45
0
14 Feb 2024
InternLM-Math: Open Math Large Language Models Toward Verifiable
  Reasoning
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Huaiyuan Ying
Shuo Zhang
Linyang Li
Zhejian Zhou
Yunfan Shao
...
Hang Yan
Xipeng Qiu
Jiayu Wang
Kai-xiang Chen
Dahua Lin
ReLMLRM
235
114
0
09 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open
  Language Models
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLMLRM
1.5K
3,856
0
05 Feb 2024
Large Language Models for Mathematical Reasoning: Progresses and
  Challenges
Large Language Models for Mathematical Reasoning: Progresses and Challenges
Janice Ahn
Rishu Verma
Renze Lou
Di Liu
Rui Zhang
Wenpeng Yin
LRM
361
267
0
31 Jan 2024
Evaluating LLMs' Mathematical and Coding Competency through
  Ontology-guided Interventions
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided InterventionsAnnual Meeting of the Association for Computational Linguistics (ACL), 2024
Pengfei Hong
Navonil Majumder
Deepanway Ghosal
Somak Aditya
Amélie Reymond
Soujanya Poria
LRM
360
13
0
17 Jan 2024
Enhancing Neural Theorem Proving through Data Augmentation and Dynamic
  Sampling Method
Enhancing Neural Theorem Proving through Data Augmentation and Dynamic Sampling Method
Rahul Vishwakarma
Subhankar Mishra
AIMat
265
2
0
20 Dec 2023
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
Haoran Liao
Qinyi Du
Shaohua Hu
Hao He
Yanyan Xu
Jidong Tian
Yaohui Jin
LRMAI4CE
195
2
0
14 Dec 2023
Large Language Models' Understanding of Math: Source Criticism and
  Extrapolation
Large Language Models' Understanding of Math: Source Criticism and Extrapolation
Roozbeh Yousefzadeh
Xuenan Cao
ELMLRM
127
6
0
12 Nov 2023
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric
  Problem Solving
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving
Xiaokai Zhang
Yang Li
Yiming He
Jia Zou
Qike Huang
...
Na Zhu
Zhen Zeng
Shaorong Xie
Xiangfeng Luo
Tuo Leng
AIMatAI4CE
390
9
0
27 Oct 2023
Llemma: An Open Language Model For Mathematics
Llemma: An Open Language Model For MathematicsInternational Conference on Learning Representations (ICLR), 2023
Zhangir Azerbayev
Hailey Schoelkopf
Keiran Paster
Marco Dos Santos
Alexander Shmakov
Albert Q. Jiang
Jia Deng
Stella Biderman
Sean Welleck
CLL
328
386
0
16 Oct 2023
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative
  Language Models
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language ModelsConference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jing Xiong
Jianhao Shen
Ye Yuan
Haiming Wang
Yichun Yin
...
Yinya Huang
Chuanyang Zheng
Xiaodan Liang
Ming Zhang
Qun Liu
AIMatLRM
203
26
0
16 Oct 2023
A New Approach Towards Autoformalization
A New Approach Towards Autoformalization
Nilay Patel
Rahul Saha
Jeffrey Flanigan
AI4CEAIMat
370
9
0
12 Oct 2023
An In-Context Learning Agent for Formal Theorem-Proving
An In-Context Learning Agent for Formal Theorem-Proving
Amitayush Thakur
George Tsoukalas
Yeming Wen
Jimmy Xin
Swarat Chaudhuri
LLMAG
271
49
0
06 Oct 2023
Previous
1234
Next