arXiv: 2301.13867
Mathematical Capabilities of ChatGPT
Neural Information Processing Systems (NeurIPS), 2023
31 January 2023
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELM AI4MH

Papers citing "Mathematical Capabilities of ChatGPT"

50 / 227 papers shown
Sequential Enumeration in Large Language Models
Kuinan Hou
Marco Zorzi
Alberto Testolin
121
1
0
04 Dec 2025
CoT4AD: A Vision-Language-Action Model with Explicit Chain-of-Thought Reasoning for Autonomous Driving
Zhaohui Wang
Tengbo Yu
Hao Tang
LRM
161
0
0
27 Nov 2025
Enhancing Large Language Models for Automated Homework Assessment in Undergraduate Circuit Analysis
Liangliang Chen
Huiru Xie
Zhihao Qin
Yiming Guo
Jacqueline Rohde
Ying Zhang
93
2
0
22 Nov 2025
Trustworthy LLM-Mediated Communication: Evaluating Information Fidelity in LLM as a Communicator (LAAC) Framework in Multiple Application Domains
Mohammed Musthafa Rafi
Adarsh Krishnamurthy
Aditya Balu
125
0
0
06 Nov 2025
The ORCA Benchmark: Evaluating Real-World Calculation Accuracy in Large Language Models
Claudia Herambourg
Dawid Siuda
Julia Kopczyńska
Joao R. L. Santos
Wojciech Sas
Joanna Śmietańska-Nowak
ELM ALM LRM
397
0
0
04 Nov 2025
Test-Time Tuned Language Models Enable End-to-end De Novo Molecular Structure Generation from MS/MS Spectra
Laura Mismetti
Marvin Alberts
Andreas Krause
Mara Graziani
92
0
0
27 Oct 2025
Foundation of Intelligence: Review of Math Word Problems from Human Cognition Perspective
Zhenya Huang
Jiayu Liu
Xin Lin
Zhiyuan Ma
Shangzi Xue
Tong Xiao
Qi Liu
Yee Whye Teh
Enhong Chen
AIMat LRM
350
0
0
24 Oct 2025
Max It or Miss It: Benchmarking LLM On Solving Extremal Problems
Binxin Gao
Jingjun Han
ELM LRM
210
0
0
14 Oct 2025
RefGrader: Automated Grading of Mathematical Competition Proofs using Agentic Workflows
Hamed Mahdavi
Pouria Mahdavinia
Samira Malek
Pegah Mohammadipour
Alireza Hashemi
Majid Daliri
Alireza Farhadi
Amir Khasahmadi
Niloofar Mireshghallah
V. Honavar
154
1
0
10 Oct 2025
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
Ivo Petrov
Jasper Dekoninck
Martin Vechev
147
3
0
06 Oct 2025
IMProofBench: Benchmarking AI on Research-Level Mathematical Proof Generation
Johannes Schmitt
Gergely Bérczi
Jasper Dekoninck
Jeremy Feusi
Tim Gehrunger
...
Raúl Sánchez Galán
Zheming Sun
Josef Teichmann
Richard P. Thomas
Charles Vial
LRM
126
3
0
30 Sep 2025
WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning
Xin Li
Mengbing Liu
Yiyang Zhu
W. Zhang
Li Wei
Jiancheng An
Chau Yuen
LRM
69
0
0
27 Sep 2025
Evaluating undergraduate mathematics examinations in the era of generative AI: a curriculum-level case study
Benjamin J. Walker
Nikoleta Kalaydzhieva
Beatriz Navarro Lameda
Ruth A. Reynolds
ELM
181
0
0
15 Sep 2025
Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
Fangzhou Wu
Sandeep Silwal
231
0
0
02 Sep 2025
A perishable ability? The future of writing in the face of generative artificial intelligence
Evandro L. T. P. Cunha
DeLMO
171
0
0
26 Aug 2025
Rethinking Reasoning in LLMs: Neuro-Symbolic Local RetoMaton Beyond ICL and CoT
Rushitha Santhoshi Mamidala
Anshuman Chhabra
Ankur Mali
OffRL LRM
128
0
0
22 Aug 2025
Structuring the Unstructured: A Systematic Review of Text-to-Structure Generation for Agentic AI with a Universal Evaluation Framework
Zheye Deng
Chunkit Chan
Tianshi Zheng
Wei Fan
Weiqi Wang
Yangqiu Song
129
3
0
17 Aug 2025
Too Easily Fooled? Prompt Injection Breaks LLMs on Frustratingly Simple Multiple-Choice Questions
Xuyang Guo
Zekai Huang
Zhao Song
Jiahao Zhang
LRM
140
3
0
16 Aug 2025
A Multi-Task Evaluation of LLMs' Processing of Academic Text Input
Tianyi Li
Yu Qin
Olivia R. Liu Sheng
115
0
0
15 Aug 2025
NLP Methods May Actually Be Better Than Professors at Estimating Question Difficulty
Leonidas Zotos
Ivo Pascal de Jong
Matias Valdenegro-Toro
A. Sburlea
Malvina Nissim
H. Rijn
164
0
0
05 Aug 2025
Causes in neuron diagrams, and testing causal reasoning in Large Language Models. A glimpse of the future of philosophy?
Louis Vervoort
Vitaly Nikolaev
162
0
0
17 Jun 2025
AbsenceBench: Language Models Can't Tell What's Missing
Harvey Yiyun Fu
Aryan Shrivastava
Jared Moore
Peter West
Chenhao Tan
Ari Holtzman
RALM
203
3
0
13 Jun 2025
MATP-BENCH: Can MLLM Be a Good Automated Theorem Prover for Multimodal Problems?
Zhitao He
Zongwei Lyu
Dazhong Chen
Dadi Guo
Yi R. Fung
LRM
232
6
0
06 Jun 2025
XToM: Exploring the Multilingual Theory of Mind for Large Language Models
Chunkit Chan
Yauwai Yim
Hongchuan Zeng
Zhiying Zou
Xinyuan Cheng
...
Ginny Wong
Helmut Schmid
Hinrich Schütze
Simon See
Yangqiu Song
LRM
201
0
0
03 Jun 2025
Evaluation of LLMs for mathematical problem solving
Ruonan Wang
Runxi Wang
Yunwen Shen
Chengfeng Wu
Qinglin Zhou
Rohitash Chandra
ELM LRM
398
2
0
30 May 2025
ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark
M. Shalyt
Rotem Elimelech
I. Kaminer
153
3
0
28 May 2025
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Yana Veitsman
Mayank Jobanputra
Yash Sarrof
Aleksandra Bakalova
Vera Demberg
Ellie Pavlick
Michael Hahn
473
2
0
27 May 2025
Two Causally Related Needles in a Video Haystack
Miaoyu Li
Qin Chao
Boyang Albert Li
CML
301
0
0
26 May 2025
Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks
Debargha Ganguly
Vikash Singh
Sreehari Sankar
Biyao Zhang
Xuecen Zhang
Srinivasan Iyengar
Xiaotian Han
Amit Sharma
Shivkumar Kalyanaraman
Vipin Chaudhary
307
2
0
26 May 2025
Small Models, Smarter Learning: The Power of Joint Task Training
C. Both
Benjamin Hoover
Hendrik Strobelt
Dmitry Krotov
Daniel Karl I. Weidele
Mauro Martino
Nima Dehmamy
224
0
0
23 May 2025
AutoMathKG: The automated mathematical knowledge graph based on LLM and vector database
International Conference on Climate Informatics (ICCI), 2025
Rong Bian
Yu Geng
Zijian Yang
Bing Cheng
533
2
0
19 May 2025
From Recall to Reasoning: Automated Question Generation for Deeper Math Learning through Large Language Models
International Conference on Artificial Intelligence in Education (AIED), 2025
Yongan Yu
Alexandre Krantz
Nikki G. Lobczowski
LRM
187
1
0
17 May 2025
Combining Bayesian Inference and Reinforcement Learning for Agent Decision Making: A Review
Chengmin Zhou
Ville Kyrki
Pasi Fränti
Laura Ruotsalainen
BDL AI4CE
423
0
0
12 May 2025
Assessment of Evolving Large Language Models in Upper Secondary Mathematics
Mika Setälä
Pieta Sikström
Ville Heilala
T. Karkkainen
ELM LRM
252
1
0
15 Apr 2025
Physics-informed KAN PointNet: Deep learning for simultaneous solutions to inverse problems in incompressible flow on numerous irregular geometries
Ali Kashefi
T. Mukerji
3DPC PINN
363
5
0
08 Apr 2025
Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics
Hamed Mahdavi
Alireza Hashemi
Majid Daliri
Pegah Mohammadipour
Alireza Farhadi
Samira Malek
Yekta Yazdanifard
Amir Khasahmadi
V. Honavar
ELM LRM
428
16
0
01 Apr 2025
Integrating Artificial Intelligence with Human Expertise: An In-depth Analysis of ChatGPT's Capabilities in Generating Metamorphic Relations
Yifan Zhang
Dave Towey
Matthew Pike
Q. Luu
Huai Liu
T. Chen
210
0
0
28 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
239
2
0
27 Mar 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
LRM ELM
505
59
0
27 Mar 2025
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation
Haoyu Fu
Diankun Zhang
Zongchuang Zhao
Jianfeng Cui
Dingkang Liang
Chong Zhang
Dingyuan Zhang
Hongwei Xie
Bing Wang
Xiang Bai
362
59
0
25 Mar 2025
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems
Felix Chen
Hangjie Yuan
Yunqiu Xu
Tao Feng
Jun Cen
Pengwei Liu
Zeying Huang
Yi Yang
LRM
280
6
0
19 Mar 2025
Unlocking Learning Potentials: The Transformative Effect of Generative AI in Education Across Grade Levels
Meijuan Xie
Liling Luo
100
0
0
15 Mar 2025
Out-of-Context Reasoning in Large Language Models
Jonathan Shaki
Emanuele La Malfa
Michael Wooldridge
Sarit Kraus
LRM ReLM
425
0
0
13 Mar 2025
Numerical Error Analysis of Large Language Models
Stanislav Budzinskiy
Wenyi Fang
Longbin Zeng
Philipp Petersen
220
2
0
13 Mar 2025
SciHorizon: Benchmarking AI-for-Science Readiness from Scientific Data to Large Language Models
Chuan Qin
Xiusi Chen
Chengrui Wang
Pengmin Wu
Xi Chen
...
Han Wu
Chong Li
Yuanchun Zhou
H. Xiong
Hengshu Zhu
ELM
306
6
0
12 Mar 2025
Measure Twice, Cut Once: Grasping Video Structures and Event Semantics with LLMs for Video Temporal Localization
Zongshang Pang
Mayu Otani
Yuta Nakashima
335
3
0
12 Mar 2025
AlphaDrive: Unleashing the Power of VLMs in Autonomous Driving via Reinforcement Learning and Reasoning
Bo Jiang
Shaoyu Chen
Qian Zhang
Wenyu Liu
Xinggang Wang
OffRL LRM VLM
342
44
0
10 Mar 2025
PiCO: Peer Review in LLMs based on the Consistency Optimization
Hai-Jian Ke
Shuo Yang
Yu-Yang Liu
Jia-Yu Yao
Zhen-Hui Liu
Yu Wang
Ming Pang
Li Yuan
ALM
504
14
0
24 Feb 2025
Curie: Toward Rigorous and Automated Scientific Experimentation with AI Agents
Patrick Tser Jern Kon
Jiachen Liu
Qiuyi Ding
Yiming Qiu
Zhenning Yang
Yibo Huang
Jayanth Srinivasa
Myungjin Lee
Mosharaf Chowdhury
Ang Chen
385
17
0
22 Feb 2025
Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems
E. Davis
S. Aaronson
ELM
376
27
0
21 Feb 2025