ResearchTrend.AI
MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics

31 August 2021
Kunhao Zheng
Jesse Michael Han
Stanislas Polu
    AIMat

Papers citing "MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics"

50 / 105 papers shown
Title
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Zihao Zhou
Shudong Liu
Maizhen Ning
Wei Liu
Jindong Wang
Derek F. Wong
Xiaowei Huang
Qiufeng Wang
Kaizhu Huang
ELM
LRM
61
23
0
11 Jul 2024
Towards Automated Functional Equation Proving: A Benchmark Dataset and A Domain-Specific In-Context Agent
Mahdi Buali
R. Hoehndorf
16
0
0
05 Jul 2024
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang
Jipeng Zhang
Yizhen Jia
Rui Pan
Shizhe Diao
Renjie Pi
Tong Zhang
LRM
33
15
0
03 Jul 2024
Learning Formal Mathematics From Intrinsic Motivation
Gabriel Poesia
David Broman
Nick Haber
Noah D. Goodman
LRM
25
9
0
30 Jun 2024
FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving
Xiaohan Lin
Qingxing Cao
Yinya Huang
Haiming Wang
Jianqiao Lu
Zhengying Liu
Linqi Song
Xiaodan Liang
LRM
36
4
0
20 Jun 2024
Proving Olympiad Algebraic Inequalities without Human Demonstrations
Chenrui Wei
Mengzhou Sun
Wei Wang
LRM
42
6
0
20 Jun 2024
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong
Xiwen Zhang
Rui Wang
R. Wu
Junxian He
AIMat
LRM
33
30
0
18 Jun 2024
miniCodeProps: a Minimal Benchmark for Proving Code Properties
Evan Lohn
Sean Welleck
32
2
0
16 Jun 2024
Improving Autoformalization using Type Checking
Auguste Poiroux
Gail Weiss
Viktor Kunčak
Antoine Bosselut
37
2
0
11 Jun 2024
Lean Workbook: A large-scale Lean problem set formalized from natural language math problems
Huaiyuan Ying
Zijian Wu
Yihan Geng
Jiayu Wang
Dahua Lin
Kai Chen
38
24
0
06 Jun 2024
Process-Driven Autoformalization in Lean 4
Jianqiao Lu
Zhengying Liu
Yingjia Wan
Yinya Huang
Haiming Wang
Zhicheng YANG
Jing Tang
Zhijiang Guo
AI4CE
37
14
0
04 Jun 2024
Proving Theorems Recursively
Haiming Wang
Huajian Xin
Zhengying Liu
Wenda Li
Yinya Huang
...
Zhicheng YANG
Jing Tang
Jian Yin
Zhenguo Li
Xiaodan Liang
LRM
28
10
0
23 May 2024
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
Huajian Xin
Daya Guo
Zhihong Shao
Z. Z. Ren
Qihao Zhu
Bo Liu
Chong Ruan
Wenda Li
Xiaodan Liang
SyDa
32
61
0
23 May 2024
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean
Peiyang Song
Kaiyu Yang
A. Anandkumar
37
10
0
18 Apr 2024
A Survey on Deep Learning for Theorem Proving
Zhaoyu Li
Jialiang Sun
Logan Murphy
Qidong Su
Zenan Li
Xian Zhang
Kaiyu Yang
Xujie Si
LRM
42
21
0
15 Apr 2024
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Peng Zhou
Charles Staats
Wenda Li
Christian Szegedy
Kilian Q. Weinberger
Yuhuai Wu
LRM
19
27
0
26 Mar 2024
BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving
Sean Lamont
Michael Norrish
Amir Dezfouli
Christian J. Walder
Paul Montague
41
2
0
06 Mar 2024
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
Qintong Li
Leyang Cui
Xueliang Zhao
Lingpeng Kong
Wei Bi
LRM
35
46
0
29 Feb 2024
Measuring Vision-Language STEM Skills of Neural Models
Jianhao Shen
Ye Yuan
Srbuhi Mirzoyan
Ming Zhang
Chenguang Wang
VLM
33
8
0
27 Feb 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELM
AIMat
33
136
0
21 Feb 2024
MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data
Yinya Huang
Xiaohan Lin
Zhengying Liu
Qingxing Cao
Huajian Xin
Haiming Wang
Zhenguo Li
Linqi Song
Xiaodan Liang
ALM
31
35
0
14 Feb 2024
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Huaiyuan Ying
Shuo Zhang
Linyang Li
Zhejian Zhou
Yunfan Shao
...
Hang Yan
Xipeng Qiu
Jiayu Wang
Kai-xiang Chen
Dahua Lin
ReLM
LRM
25
69
0
09 Feb 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Y. K. Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
26
631
0
05 Feb 2024
Large Language Models for Mathematical Reasoning: Progresses and Challenges
Janice Ahn
Rishu Verma
Renze Lou
Di Liu
Rui Zhang
Wenpeng Yin
LRM
36
115
0
31 Jan 2024
Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions
Pengfei Hong
Navonil Majumder
Deepanway Ghosal
Somak Aditya
Rada Mihalcea
Soujanya Poria
LRM
40
4
0
17 Jan 2024
Enhancing Neural Theorem Proving through Data Augmentation and Dynamic Sampling Method
Rahul Vishwakarma
Subhankar Mishra
AIMat
11
1
0
20 Dec 2023
A Survey of Reasoning with Foundation Models
Jiankai Sun
Chuanyang Zheng
E. Xie
Zhengying Liu
Ruihang Chu
...
Xipeng Qiu
Yi-Chen Guo
Hui Xiong
Qun Liu
Zhenguo Li
ReLM
LRM
AI4CE
22
76
0
17 Dec 2023
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
Haoran Liao
Qinyi Du
Shaohua Hu
Hao He
Yanyan Xu
Jidong Tian
Yaohui Jin
LRM
AI4CE
27
1
0
14 Dec 2023
Large Language Models' Understanding of Math: Source Criticism and Extrapolation
Roozbeh Yousefzadeh
Xuenan Cao
ELM
LRM
9
2
0
12 Nov 2023
FormalGeo: An Extensible Formalized Framework for Olympiad Geometric Problem Solving
Xiaokai Zhang
Na Zhu
Yiming He
Jia Zou
Qike Huang
...
Cheng Qin
Zhen Zeng
Shaorong Xie
Xiangfeng Luo
Tuo Leng
AIMat
AI4CE
14
3
0
27 Oct 2023
Llemma: An Open Language Model For Mathematics
Zhangir Azerbayev
Hailey Schoelkopf
Keiran Paster
Marco Dos Santos
Stephen Marcus McAleer
Albert Q. Jiang
Jia Deng
Stella Biderman
Sean Welleck
CLL
24
270
0
16 Oct 2023
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models
Jing Xiong
Jianhao Shen
Ye Yuan
Haiming Wang
Yichun Yin
...
Yinya Huang
Chuanyang Zheng
Xiaodan Liang
Ming Zhang
Qun Liu
AIMat
LRM
16
15
0
16 Oct 2023
A New Approach Towards Autoformalization
Nilay Patel
Rahul Saha
Jeffrey Flanigan
AI4CE
AIMat
13
3
0
12 Oct 2023
An In-Context Learning Agent for Formal Theorem-Proving
Amitayush Thakur
George Tsoukalas
Yeming Wen
Jimmy Xin
Swarat Chaudhuri
LLMAG
25
24
0
06 Oct 2023
LEGO-Prover: Neural Theorem Proving with Growing Libraries
Haiming Wang
Huajian Xin
Chuanyang Zheng
Lin Li
Zhengying Liu
...
Enze Xie
Jian Yin
Zhenguo Li
Heng Liao
Xiaodan Liang
LRM
39
61
0
01 Oct 2023
FIMO: A Challenge Formal Dataset for Automated Theorem Proving
Chengwu Liu
Jianhao Shen
Huajian Xin
Zhengying Liu
Ye Yuan
...
Chuanyang Zheng
Yichun Yin
Lin Li
Ming Zhang
Qun Liu
AIMat
AI4CE
27
31
0
08 Sep 2023
Large Language Models
Michael R Douglas
LLMAG
LM&MA
33
555
0
11 Jul 2023
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Kaiyu Yang
Aidan M. Swope
Alex Gu
Rahul Chalamala
Peiyang Song
Shixing Yu
Saad Godil
R. Prenger
Anima Anandkumar
RALM
14
207
0
27 Jun 2023
Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving
Xueliang Zhao
Wenda Li
Lingpeng Kong
25
28
0
25 May 2023
Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models
Daman Arora
H. Singh
Mausam
ELM
LRM
30
49
0
24 May 2023
Baldur: Whole-Proof Generation and Repair with Large Language Models
E. First
M. Rabe
Talia Ringer
Yuriy Brun
59
92
0
08 Mar 2023
Magnushammer: A Transformer-Based Approach to Premise Selection
Maciej Mikuła
Szymon Tworkowski
Szymon Antoniak
Bartosz Piotrowski
Albert Qiaochu Jiang
Jinyi Zhou
Christian Szegedy
Łukasz Kuciński
Piotr Miłoś
Yuhuai Wu
39
42
0
08 Mar 2023
ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics
Zhangir Azerbayev
Bartosz Piotrowski
Hailey Schoelkopf
Edward W. Ayers
Dragomir R. Radev
J. Avigad
AIMat
11
66
0
24 Feb 2023
Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions
E. Zelikman
Qian Huang
Gabriel Poesia
Noah D. Goodman
Nick Haber
ReLM
LRM
19
53
0
20 Dec 2022
A Survey of Deep Learning for Mathematical Reasoning
Pan Lu
Liang Qiu
Wenhao Yu
Sean Welleck
Kai-Wei Chang
ReLM
LRM
32
137
0
20 Dec 2022
Towards a Mathematics Formalisation Assistant using Large Language Models
Ayush Agrawal
Siddhartha Gadgil
Navin Goyal
Ashvni Narayanan
Anand Tadipatri
27
15
0
14 Nov 2022
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Q. Jiang
Sean Welleck
Jin Peng Zhou
Wenda Li
Jiacheng Liu
M. Jamnik
Timothée Lacroix
Yuhuai Wu
Guillaume Lample
AIMat
63
157
0
21 Oct 2022
Learning to Prove Trigonometric Identities
Zhouwu Liu
Yujun Li
Zhengying Liu
Lin Li
Zheng Li
12
1
0
14 Jul 2022
A Survey in Mathematical Language Processing
Jordan Meadows
André Freitas
AIMat
19
15
0
30 May 2022
Autoformalization with Large Language Models
Yuhuai Wu
Albert Q. Jiang
Wenda Li
M. Rabe
Charles Staats
M. Jamnik
Christian Szegedy
AI4CE
108
156
0
25 May 2022