Mathematical Capabilities of ChatGPT

Neural Information Processing Systems (NeurIPS), 2023

31 January 2023

Papers citing "Mathematical Capabilities of ChatGPT"

50 / 227 papers shown

The potential of large language models for improving probability learning: A study on ChatGPT3.5 and first-year computer engineering students

Maria Eugenia Castellanos

R. M. Diez

Emilio Lopez Cano

166

09 Oct 2023

FELM: Benchmarking Factuality Evaluation of Large Language Models

Neural Information Processing Systems (NeurIPS), 2023

423

01 Oct 2023

Language Models as a Service: Overview of a New Paradigm and its Challenges

Journal of Artificial Intelligence Research (JAIR), 2023

294

28 Sep 2023

ChatGPT & Mechanical Engineering: Examining performance on the FE Mechanical Engineering and Undergraduate Exams

Matthew Frenkel

Hebah Emara

156

26 Sep 2023

What does ChatGPT know about natural science and engineering?

Lukas Schulze Balhorn

Artur M. Schweidtmann

AI4MH AI4CE ELM

127

18 Sep 2023

How much can ChatGPT really help Computational Biologists in Programming?

C. R. Rahman

Limsoon Wong

AI4CE

161

17 Sep 2023

ChatGPT-4 with Code Interpreter can be used to solve introductory college-level vector calculus and electromagnetism problems

American Journal of Physics (AJP), 2023

Tanuj Kumar

M. Kats

111

16 Sep 2023

TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models

160

13 Sep 2023

Towards LLM-based Autograding for Short Textual Answers

International Conference on Computer Supported Education (CSEDU), 2023

242

09 Sep 2023

Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation

Jiatong Li

Rui Li

Qi Liu

235

08 Sep 2023

LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection

154

03 Sep 2023

No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function

Haotian Xu

LRM

241

01 Sep 2023

GPTEval: A Survey on Assessments of ChatGPT and GPT-4

International Conference on Language Resources and Evaluation (LREC), 2023

185

146

24 Aug 2023

Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis

Akshat Gupta

LLMAG AI4MH

159

23 Aug 2023

Diversity Measures: Domain-Independent Proxies for Failure in Language Model Queries

Noel Ngu

Nathaniel Lee

Paulo Shakarian

164

22 Aug 2023

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

International Conference on Learning Representations (ICLR), 2023

...

800

622

18 Aug 2023

A criterion for Artificial General Intelligence: hypothetic-deductive reasoning, tested on ChatGPT

114

05 Aug 2023

Does Correction Remain A Problem For Large Language Models?

Xipeng Qiu

152

03 Aug 2023

What Is the Difference Between a Mountain and a Molehill? Quantifying Semantic Labeling of Visual Features in Line Charts

Visual .. (VISUAL), 2023

Dennis Bromley

V. Setlur

02 Aug 2023

Olio: A Semantic Search Interface for Data Repositories

ACM Symposium on User Interface Software and Technology (UIST), 2023

V. Setlur

Andriy Kanyuka

Arjun Srinivasan

231

31 Jul 2023

How to Design and Deliver Courses for Higher Education in the AI Era: Insights from Exam Data Analysis

111

22 Jul 2023

SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models

International Conference on Machine Learning (ICML), 2023

395

172

20 Jul 2023

PharmacyGPT: The AI Pharmacist

...

Tianming Liu

268

19 Jul 2023

Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures

International Conference on Agents and Artificial Intelligence (ICAART), 2023

273

10 Jul 2023

Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes

Kausik Lakkaraju

Sai Krishna Revanth Vuruma

Vishal Pallagani

Bharath Muppasani

Biplav Srivastava

136

08 Jul 2023

A Survey on Evaluation of Large Language Models

ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023

...

Yue Zhang

Philip S. Yu

700

2,732

06 Jul 2023

Evaluating the Effectiveness of Large Language Models in Representing Textual Descriptions of Geometry and Spatial Relations

International Conference Geographic Information Science (GIScience), 2023

Yu Ji

Song Gao

161

05 Jul 2023

Evaluating ChatGPT's Decimal Skills and Feedback Generation in a Digital Learning Game

European Conference on Technology Enhanced Learning (EC-TEL), 2023

178

29 Jun 2023

Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models

127

28 Jun 2023

MyCrunchGPT: A chatGPT assisted framework for scientific machine learning

Journal of Machine Learning for Modeling and Computing (JMLMC), 2023

274

27 Jun 2023

Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination

Xuan-Quy Dao

Ngoc-Bich Le

158

10 Jun 2023

A survey of Generative AI Applications

Journal of Computer Science (JCS), 2023

Roberto Gozalo-Brizuela

Eduardo C. Garrido-Merchán

3DV MedIm

366

129

05 Jun 2023

ChatGPT is a Remarkable Tool -- For Experts

Data Intelligence (DI), 2023

A. Azaria

Rina Azoulay-Schwartz

S. Reches

135

112

02 Jun 2023

Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home

Workshop on Innovative Use of NLP for Building Educational Applications (UNBEA), 2023

188

01 Jun 2023

Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey

ACM Computing Surveys (ACM Comput. Surv.), 2023

...

Quanquan Gu

397

211

30 May 2023

A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Md Tahmid Rahman Laskar

M Saiful Bari

Mizanur Rahman

Md Amran Hossen Bhuiyan

Shafiq Joty

J. Huang

LM&MA ELM ALM

479

212

29 May 2023

Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations

160

29 May 2023

What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks

Neural Information Processing Systems (NeurIPS), 2023

512

208

27 May 2023

Disentangled Phonetic Representation for Chinese Spelling Correction

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Zihong Liang

Xiaojun Quan

Qifan Wang

182

24 May 2023

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Jakub Macina

Nico Daheim

Sankalan Pal Chowdhury

294

115

23 May 2023

ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Ján Cegin

Jakub Simko

Peter Brusilovsky

241

22 May 2023

ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Baotian Hu

177

22 May 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation

Artificial Intelligence Review (AIR), 2023

...

347

143

19 May 2023

A Preliminary Analysis on the Code Generation Capabilities of GPT-3.5 and Bard AI Models for Java Functions

181

16 May 2023

Enhancing Chemistry Learning with ChatGPT and Bing Chat as Agents to Think With: A Comparative Case Study

Social Science Research Network (SSRN), 2023

R. P. D. Santos

AI4CE LLMAG

121

12 May 2023

Humans are Still Better than ChatGPT: Case of the IEEEXtreme Competition

180

10 May 2023

Exploring the Efficacy of ChatGPT in Analyzing Student Teamwork Feedback with an Existing Taxonomy

Andrew Katz

Siqing Wei

Gaurav Nanda

Christopher G. Brinton

M. Ohland

106

09 May 2023

Professional Certification Benchmark Dataset: The First 500 Jobs For Large Language Models

David Noever

Matt Ciolino

ELM

131

07 May 2023

Enhancing STEM Learning with ChatGPT and Bing Chat as Objects to Think With: A Case Study

Social Science Research Network (SSRN), 2023

Marco Antonio Rodrigues Vasconcelos

R. P. D. Santos

LRM AI4CE

169

01 May 2023

ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations

Xin Liu

335

28 Apr 2023