Mathematical Capabilities of ChatGPT

Neural Information Processing Systems (NeurIPS), 2023
31 January 2023
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELM, AI4MH

Papers citing "Mathematical Capabilities of ChatGPT"

50 / 227 papers shown
The potential of large language models for improving probability learning: A study on ChatGPT3.5 and first-year computer engineering students
Angel Udias
A. Alonso-Ayuso
Ignacio Sanchez
Sonia Hernandez
Maria Eugenia Castellanos
R. M. Diez
Emilio Lopez Cano
166
1
0
09 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
423
62
0
01 Oct 2023
Language Models as a Service: Overview of a New Paradigm and its Challenges
Journal of Artificial Intelligence Research (JAIR), 2023
Emanuele La Malfa
Aleksandar Petrov
Simon Frieder
Christoph Weinhuber
Ryan Burnell
Raza Nazar
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
ALM, ELM
294
12
0
28 Sep 2023
ChatGPT & Mechanical Engineering: Examining performance on the FE Mechanical Engineering and Undergraduate Exams
Matthew Frenkel
Hebah Emara
156
4
0
26 Sep 2023
What does ChatGPT know about natural science and engineering?
Lukas Schulze Balhorn
Jana M. Weber
Stefan Buijsman
J. Hildebrandt
Martina Ziefle
Artur M. Schweidtmann
AI4MH, AI4CE, ELM
127
5
0
18 Sep 2023
How much can ChatGPT really help Computational Biologists in Programming?
C. R. Rahman
Limsoon Wong
AI4CE
161
3
0
17 Sep 2023
ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems
American Journal of Physics (AJP), 2023
Tanuj Kumar
M. Kats
111
12
0
16 Sep 2023
TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models
Siyao Zhang
Daocheng Fu
Zhao Zhang
Bin Yu
Pinlong Cai
160
62
0
13 Sep 2023
Towards LLM-based Autograding for Short Textual Answers
International Conference on Computer Supported Education (CSEDU), 2023
Johannes Schneider
Bernd Schenk
Christina Niklaus
AI4Ed
242
51
0
09 Sep 2023
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Jiatong Li
Rui Li
Qi Liu
235
27
0
08 Sep 2023
LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection
Jiaxing Qi
Shaohan Huang
Zhongzhi Luan
Carol J. Fung
Hailong Yang
D. Qian
154
57
0
03 Sep 2023
No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function
Haotian Xu
LRM
241
15
0
01 Sep 2023
GPTEval: A Survey on Assessments of ChatGPT and GPT-4
International Conference on Language Resources and Evaluation (LREC), 2023
Rui Mao
Guanyi Chen
Xulang Zhang
Frank Guerin
Xiaoshi Zhong
ELM, LM&MA
185
146
0
24 Aug 2023
Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis
Akshat Gupta
LLMAG, AI4MH
159
16
0
23 Aug 2023
Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries
Noel Ngu
Nathaniel Lee
Paulo Shakarian
164
5
0
22 Aug 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Haipeng Luo
Qingfeng Sun
Can Xu
Lu Wang
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM, OSLM
800
622
0
18 Aug 2023
A criterion for Artificial General Intelligence: hypothetic-deductive reasoning, tested on ChatGPT
L. Vervoort
Vitaliy Mizyakov
Anastasia V. Ugleva
ReLM, ELM, LRM
114
1
0
05 Aug 2023
Does Correction Remain A Problem For Large Language Models?
Xiaowu Zhang
Xiaotian Zhang
Cheng Yang
Hang Yan
Xipeng Qiu
LRM, KELM
152
7
0
03 Aug 2023
What Is the Difference Between a Mountain and a Molehill? Quantifying Semantic Labeling of Visual Features in Line Charts
Visual .. (VISUAL), 2023
Dennis Bromley
V. Setlur
79
13
0
02 Aug 2023
Olio: A Semantic Search Interface for Data Repositories
ACM Symposium on User Interface Software and Technology (UIST), 2023
V. Setlur
Andriy Kanyuka
Arjun Srinivasan
231
16
0
31 Jul 2023
How to Design and Deliver Courses for Higher Education in the AI Era: Insights from Exam Data Analysis
A. Wazan
I. Taj
Abdulhadi Shoufan
R. Laborde
Rémi Venant
ELM
111
2
0
22 Jul 2023
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
International Conference on Machine Learning (ICML), 2023
Xiaoxuan Wang
Ziniu Hu
Pan Lu
Yanqiao Zhu
Jieyu Zhang
Satyen Subramaniam
Arjun R. Loomba
Shichang Zhang
Luke Huan
Wei Wang
ELM, LRM
395
172
0
20 Jul 2023
PharmacyGPT: The AI Pharmacist
Zheng Liu
Zihao Wu
Mengxuan Hu
Bokai Zhao
Lin Zhao
...
Ye Shen
Sheng Li
Brian Murray
Tianming Liu
Andrea Sikora
LM&MA, AI4MH
268
8
0
19 Jul 2023
Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures
International Conference on Agents and Artificial Intelligence (ICAART), 2023
Sayed Erfan Arefin
T. Ashrafi
H. Al-Qudah
Ynes Ineza
Abdul Serwadda
ELM
273
7
0
10 Jul 2023
Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes
Kausik Lakkaraju
Sai Krishna Revanth Vuruma
Vishal Pallagani
Bharath Muppasani
Biplav Srivastava
136
18
0
08 Jul 2023
A Survey on Evaluation of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Yu-Chu Chang
Xu Wang
Yongfeng Zhang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM, LM&MA, ALM
700
2,732
0
06 Jul 2023
Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations
International Conference Geographic Information Science (GIScience), 2023
Yu Ji
Song Gao
161
25
0
05 Jul 2023
Evaluating ChatGPT's Decimal Skills and Feedback Generation in a Digital Learning Game
European Conference on Technology Enhanced Learning (EC-TEL), 2023
H. Nguyen
Hayden Stec
Xinying Hou
Sarah Di
B. McLaren
LRM
178
47
0
29 Jun 2023
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models
Zaid Alyafeai
Maged S. Alshaibani
Badr AlKhamissi
H. Luqman
Ebrahim Alareqi
A. Fadel
ELM, LM&MA, AI4MH
127
22
0
28 Jun 2023
MyCrunchGPT: A chatGPT assisted framework for scientific machine learning
Journal of Machine Learning for Modeling and Computing (JMLMC), 2023
Varun V. Kumar
Leonard Gleyzer
Adar Kahana
K. Shukla
George Karniadakis
AI4CE
274
16
0
27 Jun 2023
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination
Xuan-Quy Dao
Ngoc-Bich Le
158
45
0
10 Jun 2023
A survey of Generative AI Applications
Journal of Computer Science (JCS), 2023
Roberto Gozalo-Brizuela
Eduardo C. Garrido-Merchán
3DV, MedIm
366
129
0
05 Jun 2023
ChatGPT is a Remarkable Tool -- For Experts
Data Intelligence (DI), 2023
A. Azaria
Rina Azoulay-Schwartz
S. Reches
135
112
0
02 Jun 2023
Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2023
Eda Okur
Roddy Fuentes Alba
Saurav Sahay
L. Nachman
188
1
0
01 Jun 2023
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Chen Ling
Xujiang Zhao
Jiaying Lu
Chengyuan Deng
Can Zheng
...
Chris White
Quanquan Gu
Jian Pei
Carl Yang
Bo Pan
ALM
397
211
0
30 May 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq Joty
J. Huang
LM&MA, ELM, ALM
479
212
0
29 May 2023
Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
Attia Qammar
Hongmei Wang
Jianguo Ding
Abdenacer Naouri
M. Daneshmand
Huansheng Ning
SILM
160
25
0
29 May 2023
What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks
Neural Information Processing Systems (NeurIPS), 2023
Taicheng Guo
Kehan Guo
B. Nan
Zhengwen Liang
Zhichun Guo
Nitesh Chawla
Olaf Wiest
Xiangliang Zhang
ELM
512
208
0
27 May 2023
Disentangled Phonetic Representation for Chinese Spelling Correction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zihong Liang
Xiaojun Quan
Qifan Wang
182
26
0
24 May 2023
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
294
115
0
23 May 2023
ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ján Cegin
Jakub Simko
Peter Brusilovsky
241
57
0
22 May 2023
ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dongfang Li
Jindi Yu
Baotian Hu
Zhenran Xu
Hao Fei
ELM
177
14
0
22 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Artificial Intelligence Review (AIR), 2023
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
347
143
0
19 May 2023
A Preliminary Analysis on the Code Generation Capabilities of GPT-3.5 and Bard AI Models for Java Functions
Giuseppe Destefanis
Silvia Bartolucci
Marco Ortu
ELM
181
26
0
16 May 2023
Enhancing Chemistry Learning with ChatGPT and Bing Chat as Agents to Think With: A Comparative Case Study
Social Science Research Network (SSRN), 2023
R. P. D. Santos
AI4CE, LLMAG
121
33
0
12 May 2023
Humans are Still Better than ChatGPT: Case of the IEEEXtreme Competition
Anis Koubaa
B. Qureshi
Adel Ammar
Zahid Khan
W. Boulila
L. Ghouti
ELM, ALM
180
32
0
10 May 2023
Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy
Andrew Katz
Siqing Wei
Gaurav Nanda
Christopher G. Brinton
M. Ohland
106
17
0
09 May 2023
Professional Certification Benchmark Dataset: The First 500 Jobs For Large Language Models
David Noever
Matt Ciolino
ELM
131
4
0
07 May 2023
Enhancing STEM Learning with ChatGPT and Bing Chat as Objects to Think With: A Case Study
Social Science Research Network (SSRN), 2023
Marco Antonio Rodrigues Vasconcelos
R. P. D. Santos
LRM, AI4CE
169
92
0
01 May 2023
ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations
Chunkit Chan
Cheng Jiayang
Weiqi Wang
Yuxin Jiang
Tianqing Fang
Xin Liu
Yangqiu Song
LRM
335
67
0
28 Apr 2023