2301.13867
Mathematical Capabilities of ChatGPT
Neural Information Processing Systems (NeurIPS), 2023
31 January 2023
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELM
AI4MH
Papers citing "Mathematical Capabilities of ChatGPT"
50 / 227 papers shown
The potential of large language models for improving probability learning: A study on ChatGPT3.5 and first-year computer engineering students
Angel Udias
A. Alonso-Ayuso
Ignacio Sanchez
Sonia Hernandez
Maria Eugenia Castellanos
R. M. Diez
Emilio Lopez Cano
166
1
0
09 Oct 2023
FELM: Benchmarking Factuality Evaluation of Large Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shiqi Chen
Yiran Zhao
Jinghan Zhang
Ethan Chern
Siyang Gao
Pengfei Liu
Junxian He
HILM
423
62
0
01 Oct 2023
Language Models as a Service: Overview of a New Paradigm and its Challenges
Journal of Artificial Intelligence Research (JAIR), 2023
Emanuele La Malfa
Aleksandar Petrov
Simon Frieder
Christoph Weinhuber
Ryan Burnell
Raza Nazar
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
ALM
ELM
294
12
0
28 Sep 2023
ChatGPT & Mechanical Engineering: Examining performance on the FE Mechanical Engineering and Undergraduate Exams
Matthew Frenkel
Hebah Emara
156
4
0
26 Sep 2023
What does ChatGPT know about natural science and engineering?
Lukas Schulze Balhorn
Jana M. Weber
Stefan Buijsman
J. Hildebrandt
Martina Ziefle
Artur M. Schweidtmann
AI4MH
AI4CE
ELM
127
5
0
18 Sep 2023
How much can ChatGPT really help Computational Biologists in Programming?
C. R. Rahman
Limsoon Wong
AI4CE
161
3
0
17 Sep 2023
ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems
American Journal of Physics (AJP), 2023
Tanuj Kumar
M. Kats
111
12
0
16 Sep 2023
TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models
Siyao Zhang
Daocheng Fu
Zhao Zhang
Bin Yu
Pinlong Cai
160
62
0
13 Sep 2023
Towards LLM-based Autograding for Short Textual Answers
International Conference on Computer Supported Education (CSEDU), 2023
Johannes Schneider
Bernd Schenk
Christina Niklaus
AI4Ed
242
51
0
09 Sep 2023
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation
Jiatong Li
Rui Li
Qi Liu
235
27
0
08 Sep 2023
LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection
Jiaxing Qi
Shaohan Huang
Zhongzhi Luan
Carol J. Fung
Hailong Yang
D. Qian
154
57
0
03 Sep 2023
No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function
Haotian Xu
LRM
241
15
0
01 Sep 2023
GPTEval: A Survey on Assessments of ChatGPT and GPT-4
International Conference on Language Resources and Evaluation (LREC), 2023
Rui Mao
Guanyi Chen
Xulang Zhang
Frank Guerin
Xiaoshi Zhong
ELM
LM&MA
185
146
0
24 Aug 2023
Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis
Akshat Gupta
LLMAG
AI4MH
159
16
0
23 Aug 2023
Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries
Noel Ngu
Nathaniel Lee
Paulo Shakarian
164
5
0
22 Aug 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
International Conference on Learning Representations (ICLR), 2023
Haipeng Luo
Qingfeng Sun
Can Xu
Lu Wang
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
800
622
0
18 Aug 2023
A criterion for Artificial General Intelligence: hypothetic-deductive reasoning, tested on ChatGPT
L. Vervoort
Vitaliy Mizyakov
Anastasia V. Ugleva
ReLM
ELM
LRM
114
1
0
05 Aug 2023
Does Correction Remain A Problem For Large Language Models?
Xiaowu Zhang
Xiaotian Zhang
Cheng Yang
Hang Yan
Xipeng Qiu
LRM
KELM
152
7
0
03 Aug 2023
What Is the Difference Between a Mountain and a Molehill? Quantifying Semantic Labeling of Visual Features in Line Charts
Visual .. (VISUAL), 2023
Dennis Bromley
V. Setlur
79
13
0
02 Aug 2023
Olio: A Semantic Search Interface for Data Repositories
ACM Symposium on User Interface Software and Technology (UIST), 2023
V. Setlur
Andriy Kanyuka
Arjun Srinivasan
231
16
0
31 Jul 2023
How to Design and Deliver Courses for Higher Education in the AI Era: Insights from Exam Data Analysis
A. Wazan
I. Taj
Abdulhadi Shoufan
R. Laborde
Rémi Venant
ELM
111
2
0
22 Jul 2023
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
International Conference on Machine Learning (ICML), 2023
Xiaoxuan Wang
Ziniu Hu
Pan Lu
Yanqiao Zhu
Jieyu Zhang
Satyen Subramaniam
Arjun R. Loomba
Shichang Zhang
Luke Huan
Wei Wang
ELM
LRM
395
172
0
20 Jul 2023
PharmacyGPT: The AI Pharmacist
Zheng Liu
Zihao Wu
Mengxuan Hu
Bokai Zhao
Lin Zhao
...
Ye Shen
Sheng Li
Brian Murray
Tianming Liu
Andrea Sikora
LM&MA
AI4MH
268
8
0
19 Jul 2023
Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures
International Conference on Agents and Artificial Intelligence (ICAART), 2023
Sayed Erfan Arefin
T. Ashrafi
H. Al-Qudah
Ynes Ineza
Abdul Serwadda
ELM
273
7
0
10 Jul 2023
Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes
Kausik Lakkaraju
Sai Krishna Revanth Vuruma
Vishal Pallagani
Bharath Muppasani
Biplav Srivastava
136
18
0
08 Jul 2023
A Survey on Evaluation of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Yu-Chu Chang
Xu Wang
Yongfeng Zhang
Yuanyi Wu
Linyi Yang
...
Yue Zhang
Yi-Ju Chang
Philip S. Yu
Qian Yang
Xingxu Xie
ELM
LM&MA
ALM
700
2,732
0
06 Jul 2023
Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations
International Conference Geographic Information Science (GIScience), 2023
Yu Ji
Song Gao
161
25
0
05 Jul 2023
Evaluating ChatGPT's Decimal Skills and Feedback Generation in a Digital Learning Game
European Conference on Technology Enhanced Learning (EC-TEL), 2023
H. Nguyen
Hayden Stec
Xinying Hou
Sarah Di
B. McLaren
LRM
178
47
0
29 Jun 2023
Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models
Zaid Alyafeai
Maged S. Alshaibani
Badr AlKhamissi
H. Luqman
Ebrahim Alareqi
A. Fadel
ELM
LM&MA
AI4MH
127
22
0
28 Jun 2023
MyCrunchGPT: A chatGPT assisted framework for scientific machine learning
Journal of Machine Learning for Modeling and Computing (JMLMC), 2023
Varun V. Kumar
Leonard Gleyzer
Adar Kahana
K. Shukla
George Karniadakis
AI4CE
274
16
0
27 Jun 2023
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination
Xuan-Quy Dao
Ngoc-Bich Le
158
45
0
10 Jun 2023
A survey of Generative AI Applications
Journal of Computer Science (JCS), 2023
Roberto Gozalo-Brizuela
Eduardo C. Garrido-Merchán
3DV
MedIm
366
129
0
05 Jun 2023
ChatGPT is a Remarkable Tool -- For Experts
Data Intelligence (DI), 2023
A. Azaria
Rina Azoulay-Schwartz
S. Reches
135
112
0
02 Jun 2023
Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2023
Eda Okur
Roddy Fuentes Alba
Saurav Sahay
L. Nachman
188
1
0
01 Jun 2023
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Chen Ling
Xujiang Zhao
Jiaying Lu
Chengyuan Deng
Can Zheng
...
Chris White
Quanquan Gu
Jian Pei
Carl Yang
Bo Pan
ALM
397
211
0
30 May 2023
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Md Tahmid Rahman Laskar
M Saiful Bari
Mizanur Rahman
Md Amran Hossen Bhuiyan
Shafiq Joty
J. Huang
LM&MA
ELM
ALM
479
212
0
29 May 2023
Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
Attia Qammar
Hongmei Wang
Jianguo Ding
Abdenacer Naouri
M. Daneshmand
Huansheng Ning
SILM
160
25
0
29 May 2023
What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks
Neural Information Processing Systems (NeurIPS), 2023
Taicheng Guo
Kehan Guo
B. Nan
Zhengwen Liang
Zhichun Guo
Nitesh Chawla
Olaf Wiest
Xiangliang Zhang
ELM
512
208
0
27 May 2023
Disentangled Phonetic Representation for Chinese Spelling Correction
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zihong Liang
Xiaojun Quan
Qifan Wang
182
26
0
24 May 2023
MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Jakub Macina
Nico Daheim
Sankalan Pal Chowdhury
Tanmay Sinha
Manu Kapur
Iryna Gurevych
Mrinmaya Sachan
LRM
294
115
0
23 May 2023
ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ján Cegin
Jakub Simko
Peter Brusilovsky
241
57
0
22 May 2023
ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Dongfang Li
Jindi Yu
Baotian Hu
Zhenran Xu
Hao Fei
ELM
177
14
0
22 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Artificial Intelligence Review (AIR), 2023
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
347
143
0
19 May 2023
A Preliminary Analysis on the Code Generation Capabilities of GPT-3.5 and Bard AI Models for Java Functions
Giuseppe Destefanis
Silvia Bartolucci
Marco Ortu
ELM
181
26
0
16 May 2023
Enhancing Chemistry Learning with ChatGPT and Bing Chat as Agents to Think With: A Comparative Case Study
Social Science Research Network (SSRN), 2023
R. P. D. Santos
AI4CE
LLMAG
121
33
0
12 May 2023
Humans are Still Better than ChatGPT: Case of the IEEEXtreme Competition
Anis Koubaa
B. Qureshi
Adel Ammar
Zahid Khan
W. Boulila
L. Ghouti
ELM
ALM
180
32
0
10 May 2023
Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy
Andrew Katz
Siqing Wei
Gaurav Nanda
Christopher G. Brinton
M. Ohland
106
17
0
09 May 2023
Professional Certification Benchmark Dataset: The First 500 Jobs For Large Language Models
David Noever
Matt Ciolino
ELM
131
4
0
07 May 2023
Enhancing STEM Learning with ChatGPT and Bing Chat as Objects to Think With: A Case Study
Social Science Research Network (SSRN), 2023
Marco Antonio Rodrigues Vasconcelos
R. P. D. Santos
LRM
AI4CE
169
92
0
01 May 2023
ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations
Chunkit Chan
Cheng Jiayang
Weiqi Wang
Yuxin Jiang
Tianqing Fang
Xin Liu
Yangqiu Song
LRM
335
67
0
28 Apr 2023