Mathematical Capabilities of ChatGPT

Neural Information Processing Systems (NeurIPS), 2023
31 January 2023
Simon Frieder
Luca Pinchetti
Alexis Chevalier
Ryan-Rhys Griffiths
Tommaso Salvatori
Thomas Lukasiewicz
P. Petersen
Julius Berner
ELM, AI4MH

Papers citing "Mathematical Capabilities of ChatGPT"

50 / 227 papers shown
Can LLMs Understand Computer Networks? Towards a Virtual System Administrator
Denis Donadel
Francesco Marchiori
Luca Pajola
Mauro Conti
293
18
0
19 Apr 2024
A Survey on Deep Learning for Theorem Proving
Zhaoyu Li
Jialiang Sun
Logan Murphy
Qidong Su
Zenan Li
Xian Zhang
Kaiyu Yang
Xujie Si
LRM
284
49
0
15 Apr 2024
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Yuheng Huang
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
385
12
0
12 Apr 2024
Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra
Darioush Kevian
U. Syed
Xing-ming Guo
Aaron J. Havens
Geir Dullerud
Peter M. Seiler
Lianhui Qin
Bin Hu
ELM
202
57
0
04 Apr 2024
From Large to Tiny: Distilling and Refining Mathematical Expertise for Math Word Problems with Weakly Supervision
Qingwen Lin
Boyan Xu
Zhengting Huang
Ruichu Cai
306
4
0
21 Mar 2024
Review of Generative AI Methods in Cybersecurity
Yagmur Yigit
William J. Buchanan
Madjid G Tehrani
Leandros A. Maglaras
AAML
442
37
0
13 Mar 2024
Human I/O: Towards a Unified Approach to Detecting Situational Impairments
Xingyu Bruce Liu
Jiahao Nick Li
David Kim
Xiang 'Anthony' Chen
Andrea Colaço
223
23
0
06 Mar 2024
Chaining thoughts and LLMs to learn DNA structural biophysics
Tyler D. Ross
Ashwin Gopinath
AI4CE
120
3
0
02 Mar 2024
Large Language Models and Games: A Survey and Roadmap
Roberto Gallotta
Graham Todd
Marvin Zammit
Sam Earle
Antonios Liapis
Julian Togelius
Georgios N. Yannakakis
LLMAG, LM&MA, AI4CE, LRM
485
131
0
28 Feb 2024
A New Era in LLM Security: Exploring Security Concerns in Real-World LLM-based Systems
Fangzhou Wu
Ning Zhang
Somesh Jha
P. McDaniel
Chaowei Xiao
256
101
0
28 Feb 2024
WIPI: A New Web Threat for LLM-Driven Web Agents
Fangzhou Wu
Shutong Wu
Yulong Cao
Chaowei Xiao
LLMAG
235
38
0
26 Feb 2024
How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study
Tianjie Ju
Weiwei Sun
Wei Du
Xinwei Yuan
Zhaochun Ren
Gongshen Liu
KELM
222
56
0
25 Feb 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELM, AIMat
398
677
0
21 Feb 2024
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li
Bolin Zhu
Kaiwen Shi
Sichen Liu
Yin Zhu
Yiwei Liu
Gong Cheng
AIMat
603
1
0
20 Feb 2024
Language Models as Science Tutors
Alexis Chevalier
Jiayi Geng
Alexander Wettig
Howard Chen
Sebastian Mizera
...
Jiatong Yu
Jun-Jie Zhu
Z. Ren
Sanjeev Arora
Danqi Chen
ELM
250
15
0
16 Feb 2024
UrbanKGent: A Unified Large Language Model Agent Framework for Urban Knowledge Graph Construction
Neural Information Processing Systems (NeurIPS), 2024
Yansong Ning
Hao Liu
LLMAG
259
16
0
10 Feb 2024
Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2024
Simone Balloccu
Patrícia Schmidtová
Mateusz Lango
Ondrej Dusek
SILM, ELM, PILM
441
251
0
06 Feb 2024
Large Language Models for Mathematical Reasoning: Progresses and Challenges
Janice Ahn
Rishu Verma
Renze Lou
Di Liu
Rui Zhang
Wenpeng Yin
LRM
361
267
0
31 Jan 2024
ChatGPT in the classroom. Exploring its potential and limitations in a Functional Programming course
International journal of human computer interactions (IJHCI), 2023
Dan-Matei Popovici
158
55
0
20 Jan 2024
Code Simulation Challenges for Large Language Models
Emanuele La Malfa
Christoph Weinhuber
Orazio Torre
Fangru Lin
Samuele Marro
Anthony Cohn
Nigel Shadbolt
Michael Wooldridge
LLMAG, LRM
309
10
0
17 Jan 2024
Stability Analysis of ChatGPT-based Sentiment Analysis in AI Quality Assurance
Tinghui Ouyang
AprilPyone Maungmaung
Koichi Konishi
Yoshiki Seo
Isao Echizen
AI4MH
193
13
0
15 Jan 2024
Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning
Yiqi Wang
Wentao Chen
Xiaotian Han
Xudong Lin
Haiteng Zhao
Yongfei Liu
Bohan Zhai
Jianbo Yuan
Quanzeng You
Hongxia Yang
LRM
305
146
0
10 Jan 2024
AI Hallucinations: A Misnomer Worth Clarifying
Conference on Algebraic Informatics (CAI), 2024
Negar Maleki
Balaji Padmanabhan
Kaushik Dutta
442
102
0
09 Jan 2024
Computational Argumentation-based Chatbots: a Survey
Federico Castagna
Nadin Kökciyan
I. Sassoon
Simon Parsons
Elizabeth I. Sklar
317
16
0
07 Jan 2024
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Wenqi Zhang
Yongliang Shen
Linjuan Wu
Qiuying Peng
Jun Wang
Yueting Zhuang
Weiming Lu
LRM, LLMAG
511
95
0
04 Jan 2024
NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes
Lizhou Fan
Qingfeng Lan
Jinkui Chi
Haoyang Ling
Yongfeng Zhang
LRM
337
88
0
22 Dec 2023
Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities
Yuhao Chen
Chloe Wong
Hanwen Yang
Juan Aguenza
Sai Bhujangari
...
Eric Phuong
Minghao Liu
Raja Kumar
Vanshika Vats
James Davis
335
1
0
22 Dec 2023
Evaluating AI Vocational Skills Through Professional Testing
David Noever
Matt Ciolino
ELM
130
0
0
17 Dec 2023
Exploring Large Language Models in Resolving Environment-Related Crash Bugs: Localizing and Repairing
Xueying Du
Wentai Deng
Hanlin Wang
Juntao Li
Xin Peng
Xin Peng
LRM
156
8
0
16 Dec 2023
Early ChatGPT User Portrait through the Lens of Data
Yuyang Deng
Ni Zhao
Xin Huang
138
9
0
10 Dec 2023
Exploring the Limits of ChatGPT in Software Security Applications
Fangzhou Wu
Qingzhao Zhang
Ati Priya Bajaj
Tiffany Bao
Ning Zhang
Ruoyu Wang
Chaowei Xiao
ALM, SILM, ELM
229
12
0
08 Dec 2023
DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial Natural Language Instructions
Fangzhou Wu
Xiaogeng Liu
Chaowei Xiao
AAML, SILM
305
37
0
07 Dec 2023
Large Language Models for Mathematicians
Simon Frieder
Julius Berner
P. Petersen
Thomas Lukasiewicz
220
8
0
07 Dec 2023
InteraSSort: Interactive Assortment Planning Using Large Language Models
Social Science Research Network (SSRN), 2023
Saketh Reddy Karra
Theja Tulabandhula
169
3
0
20 Nov 2023
Exploring the Potential of Large Language Models in Computational Argumentation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Guizhen Chen
Liying Cheng
Anh Tuan Luu
Lidong Bing
LLMAG, LRM
268
45
0
15 Nov 2023
When does In-context Learning Fall Short and Why? A Study on Specification-Heavy Tasks
Hao Peng
Xiaozhi Wang
Jianhui Chen
Weikai Li
Yunjia Qi
...
Zhili Wu
Kaisheng Zeng
Bin Xu
Lei Hou
Juanzi Li
258
42
0
15 Nov 2023
Adversarial Preference Optimization: Enhancing Your Alignment via RM-LLM Game
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Pengyu Cheng
Yifan Yang
Jian Li
Yong Dai
Tianhao Hu
Peixin Cao
Nan Du
Xiaolong Li
698
33
0
14 Nov 2023
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ruomeng Ding
Chaoyun Zhang
Lu Wang
Yong Xu
Ming-Jie Ma
Wei Zhang
Si Qin
Saravan Rajmohan
Qingwei Lin
Dongmei Zhang
LRM
379
85
0
07 Nov 2023
An Interdisciplinary Outlook on Large Language Models for Scientific Research
James Boyko
Joseph Cohen
Nathan Fox
Maria Han Veiga
Jennifer I-Hsiu Li
...
Andreas H. Rauch
Kenneth N. Reid
Soumi Tribedi
Anastasia Visheratina
Xin Xie
234
22
0
03 Nov 2023
The Expressibility of Polynomial based Attention Scheme
Zhao Song
Guangyi Xu
Junze Yin
311
7
0
30 Oct 2023
The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics
Christoph Leiter
Juri Opitz
Daniel Deutsch
Yang Gao
Rotem Dror
Steffen Eger
ALM, LRM, ELM
324
37
0
30 Oct 2023
Enhancing Chemistry Learning with ChatGPT, Bing Chat, Bard, and Claude as Agents-to-Think-With: A Comparative Case Study
Social Science Research Network (SSRN), 2023
Renato P. dos Santos
122
10
0
23 Oct 2023
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
IEEE Transactions on Software Engineering (TSE), 2023
Da Song
Xuan Xie
Yuheng Huang
Derui Zhu
Yuheng Huang
Felix Juefei Xu
Lei Ma
ALM
350
9
0
22 Oct 2023
AI for Mathematics: A Cognitive Science Perspective
Cedegao E. Zhang
Katherine M. Collins
Adrian Weller
Joshua B. Tenenbaum
208
12
0
19 Oct 2023
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
Shiyuan Huang
Siddarth Mamidanna
Shreedhar Jangam
Yilun Zhou
Leilani H. Gilpin
LRM, MILM, ELM
372
108
0
17 Oct 2023
Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiaoshuai Song
Keqing He
Pei Wang
Guanting Dong
Yutao Mou
Jingang Wang
Yunsen Xian
Xunliang Cai
Weiran Xu
LRM
223
25
0
16 Oct 2023
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLM, ELM, LRM
386
8
0
13 Oct 2023
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams
Ethan Callanan
A. Mbakwe
Antony Papadimitriou
Yulong Pei
Mathieu Sibue
Xiaodan Zhu
Zhiqiang Ma
Xiaomo Liu
Sameena Shah
ELM
238
26
0
12 Oct 2023
A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shiping Yang
Renliang Sun
Xiao-Yi Wan
HILM
298
54
0
10 Oct 2023
OptiMUS: Optimization Modeling Using MIP Solvers and large language models
Ali AhmadiTeshnizi
Wenzhi Gao
Madeleine Udell
LLMAG
139
48
0
09 Oct 2023