ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2308.07921
  4. Cited By
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with
  Code-based Self-Verification

Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

15 August 2023
Aojun Zhou
Ke Wang
Zimu Lu
Weikang Shi
Sichun Luo
Zipeng Qin
Shaoqing Lu
Anya Jia
Linqi Song
Mingjie Zhan
Hongsheng Li
    ReLM
    LRM
ArXivPDFHTML

Papers citing "Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification"

50 / 131 papers shown
Title
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang
Zengzhi Wang
Shijie Xia
Xuefeng Li
Haoyang Zou
...
Yuxiang Zheng
Shaoting Zhang
Dahua Lin
Yu Qiao
Pengfei Liu
ELM
LRM
43
25
0
18 Jun 2024
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with
  Geometric Image Generation
GeoGPT4V: Towards Geometric Multi-modal Large Language Models with Geometric Image Generation
Shihao Cai
Keqin Bao
Hangyu Guo
Jizhi Zhang
Jun Song
Bo Zheng
39
14
0
17 Jun 2024
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and
  Latent Concept
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu
Haitao Mao
Bochuan Cao
Zhiyu Xue
K. Johnson
Jiliang Tang
Rongrong Wang
LRM
24
9
0
04 Jun 2024
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
Owen Dugan
Donato Manuel Jimenez Beneto
Charlotte Loh
Zhuo Chen
Rumen Dangovski
Marin Soljacic
LRM
29
1
0
04 Jun 2024
Evaluating Mathematical Reasoning of Large Language Models: A Focus on
  Error Identification and Correction
Evaluating Mathematical Reasoning of Large Language Models: A Focus on Error Identification and Correction
Xiaoyuan Li
Wenjie Wang
Moxin Li
Junrong Guo
Yang Zhang
Fuli Feng
ELM
LRM
33
15
0
02 Jun 2024
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off
  Code Generation
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren
Mingjie Zhan
Zhongyuan Wu
Aojun Zhou
Junting Pan
Hongsheng Li
SyDa
27
7
0
27 May 2024
GeneAgent: Self-verification Language Agent for Gene Set Knowledge
  Discovery using Domain Databases
GeneAgent: Self-verification Language Agent for Gene Set Knowledge Discovery using Domain Databases
Zhizheng Wang
Qiao Jin
Chih-Hsuan Wei
Shubo Tian
Po-Ting Lai
Qingqing Zhu
Chi-Ping Day
Christina Ross
Zhiyong Lu
LLMAG
19
8
0
25 May 2024
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large
  Language Models
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models
Xudong Lu
Aojun Zhou
Yuhui Xu
Renrui Zhang
Peng Gao
Hongsheng Li
19
7
0
25 May 2024
DuetSim: Building User Simulator with Dual Large Language Models for
  Task-Oriented Dialogues
DuetSim: Building User Simulator with Dual Large Language Models for Task-Oriented Dialogues
Xiang Luo
Zhiwen Tang
Jin Wang
Xuejie Zhang
21
4
0
16 May 2024
MacBehaviour: An R package for behavioural experimentation on large
  language models
MacBehaviour: An R package for behavioural experimentation on large language models
Xufeng Duan
Shixuan Li
Zhenguang G. Cai
MLLM
34
2
0
13 May 2024
A Philosophical Introduction to Language Models - Part II: The Way
  Forward
A Philosophical Introduction to Language Models - Part II: The Way Forward
Raphael Milliere
Cameron Buckner
LRM
52
11
0
06 May 2024
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models
Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models
Leonardo Ranaldi
André Freitas
LRM
ReLM
29
8
0
01 May 2024
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Small Language Models Need Strong Verifiers to Self-Correct Reasoning
Yunxiang Zhang
Muhammad Khalifa
Lajanugen Logeswaran
Jaekyeom Kim
Moontae Lee
Honglak Lee
Lu Wang
LRM
KELM
ReLM
23
31
0
26 Apr 2024
iTBLS: A Dataset of Interactive Conversations Over Tabular Information
iTBLS: A Dataset of Interactive Conversations Over Tabular Information
Anirudh S. Sundar
Christopher Richardson
William Gay
Larry Heck
LMTD
29
1
0
19 Apr 2024
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving
  Complex Mathematical Problems
MACM: Utilizing a Multi-Agent System for Condition Mining in Solving Complex Mathematical Problems
Bin Lei
LLMAG
AI4CE
28
11
0
06 Apr 2024
Can LLMs Master Math? Investigating Large Language Models on Math Stack
  Exchange
Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange
Ankit Satpute
Noah Giessing
André Greiner-Petter
M. Schubotz
O. Teschke
Akiko Aizawa
Bela Gipp
ELM
LRM
26
18
0
30 Mar 2024
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual
  Math Problems?
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang
Dongzhi Jiang
Yichi Zhang
Haokun Lin
Ziyu Guo
...
Aojun Zhou
Pan Lu
Kai-Wei Chang
Peng Gao
Hongsheng Li
27
165
0
21 Mar 2024
StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
StateFlow: Enhancing LLM Task-Solving through State-Driven Workflows
Yiran Wu
Tianwei Yue
Shaokun Zhang
Chi Wang
Qingyun Wu
40
21
0
17 Mar 2024
Think Twice Before Trusting: Self-Detection for Large Language Models
  through Comprehensive Answer Reflection
Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection
Moxin Li
Wenjie Wang
Fuli Feng
Fengbin Zhu
Qifan Wang
Tat-Seng Chua
HILM
LRM
33
8
0
15 Mar 2024
Can LLM Substitute Human Labeling? A Case Study of Fine-grained Chinese
  Address Entity Recognition Dataset for UAV Delivery
Can LLM Substitute Human Labeling? A Case Study of Fine-grained Chinese Address Entity Recognition Dataset for UAV Delivery
Yuxuan Yao
Sichun Luo
Haohan Zhao
Guanzhi Deng
Linqi Song
29
7
0
10 Mar 2024
Teaching Large Language Models to Reason with Reinforcement Learning
Teaching Large Language Models to Reason with Reinforcement Learning
Alex Havrilla
Yuqing Du
Sharath Chandra Raparthy
Christoforos Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Sainbayar Sukhbaatar
Roberta Raileanu
ReLM
LRM
29
67
0
07 Mar 2024
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical
  Reasoning
Key-Point-Driven Data Synthesis with its Enhancement on Mathematical Reasoning
Yiming Huang
Xiao Liu
Yeyun Gong
Zhibin Gou
Yelong Shen
Nan Duan
Weizhu Chen
AIMat
LRM
56
35
0
04 Mar 2024
FAC$^2$E: Better Understanding Large Language Model Capabilities by
  Dissociating Language and Cognition
FAC2^22E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition
Xiaoqiang Wang
Bang Liu
Lingfei Wu
22
0
0
29 Feb 2024
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of
  LLMs as Mathematical Problem Solvers
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
Qintong Li
Leyang Cui
Xueliang Zhao
Lingpeng Kong
Wei Bi
LRM
35
46
0
29 Feb 2024
Data Interpreter: An LLM Agent For Data Science
Data Interpreter: An LLM Agent For Data Science
Sirui Hong
Yizhang Lin
Bang Liu
Bangbang Liu
Binhao Wu
...
Xinbing Liang
Yaying Fei
Yuheng Cheng
Zongze Xu
Chenglin Wu
LLMAG
AI4CE
47
58
0
28 Feb 2024
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical
  Reasoning
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning
Debrup Das
Debopriyo Banerjee
Somak Aditya
Ashish Kulkarni
ReLM
LRM
21
10
0
27 Feb 2024
MathGenie: Generating Synthetic Data with Question Back-translation for
  Enhancing Mathematical Reasoning of LLMs
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
Zimu Lu
Aojun Zhou
Houxing Ren
Ke Wang
Weikang Shi
Junting Pan
Mingjie Zhan
Hongsheng Li
SyDa
LRM
45
42
0
26 Feb 2024
Debug like a Human: A Large Language Model Debugger via Verifying
  Runtime Execution Step-by-step
Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step
Li Zhong
Zilong Wang
Jingbo Shang
19
47
0
25 Feb 2024
Not All Experts are Equal: Efficient Expert Pruning and Skipping for
  Mixture-of-Experts Large Language Models
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Xudong Lu
Qi Liu
Yuhui Xu
Aojun Zhou
Siyuan Huang
Bo-Wen Zhang
Junchi Yan
Hongsheng Li
MoE
27
25
0
22 Feb 2024
OlympiadBench: A Challenging Benchmark for Promoting AGI with
  Olympiad-Level Bilingual Multimodal Scientific Problems
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
Chaoqun He
Renjie Luo
Yuzhuo Bai
Shengding Hu
Zhen Leng Thai
...
Yuxiang Zhang
Jie Liu
Lei Qi
Zhiyuan Liu
Maosong Sun
ELM
AIMat
33
134
0
21 Feb 2024
EmoBench: Evaluating the Emotional Intelligence of Large Language Models
EmoBench: Evaluating the Emotional Intelligence of Large Language Models
Sahand Sabour
Siyang Liu
Zheyuan Zhang
June M. Liu
Jinfeng Zhou
Alvionna S. Sunaryo
Juanzi Li
Tatia M.C. Lee
Rada Mihalcea
Minlie Huang
22
11
0
19 Feb 2024
SciAgent: Tool-augmented Language Models for Scientific Reasoning
SciAgent: Tool-augmented Language Models for Scientific Reasoning
Yubo Ma
Zhibin Gou
Junheng Hao
Ruochen Xu
Shuohang Wang
...
Yujiu Yang
Yixin Cao
Aixin Sun
Hany Awadalla
Weizhu Chen
RALM
LRM
LLMAG
38
20
0
18 Feb 2024
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
Shubham Toshniwal
Ivan Moshkov
Sean Narenthiran
Daria Gitman
Fei Jia
Igor Gitman
23
75
0
15 Feb 2024
SwissNYF: Tool Grounded LLM Agents for Black Box Setting
SwissNYF: Tool Grounded LLM Agents for Black Box Setting
Somnath Sendhil Kumar
Dhruv Jain
Eshaan Agarwal
Raunak Pandey
LLMAG
27
0
0
15 Feb 2024
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and
  Local Refinements
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alex Havrilla
Sharath Raparthy
Christoforus Nalmpantis
Jane Dwivedi-Yu
Maksym Zhuravinskyi
Eric Hambro
Roberta Railneau
ReLM
LRM
28
49
0
13 Feb 2024
Feedback Loops With Language Models Drive In-Context Reward Hacking
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan
Erik Jones
Meena Jagadeesan
Jacob Steinhardt
KELM
42
25
0
09 Feb 2024
InternLM-Math: Open Math Large Language Models Toward Verifiable
  Reasoning
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Huaiyuan Ying
Shuo Zhang
Linyang Li
Zhejian Zhou
Yunfan Shao
...
Hang Yan
Xipeng Qiu
Jiayu Wang
Kai-xiang Chen
Dahua Lin
ReLM
LRM
25
68
0
09 Feb 2024
Training Large Language Models for Reasoning through Reverse Curriculum
  Reinforcement Learning
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi
Wenxiang Chen
Boyang Hong
Senjie Jin
Rui Zheng
...
Xinbo Zhang
Peng Sun
Tao Gui
Qi Zhang
Xuanjing Huang
LRM
27
20
0
08 Feb 2024
Limits of Transformer Language Models on Learning to Compose Algorithms
Limits of Transformer Language Models on Learning to Compose Algorithms
Jonathan Thomm
Aleksandar Terzić
Giacomo Camposampiero
Michael Hersche
Bernhard Schölkopf
Abbas Rahimi
34
3
0
08 Feb 2024
Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large
  Language Models
Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models
Spyridon Mouselinos
Henryk Michalewski
Mateusz Malinowski
LRM
26
5
0
06 Feb 2024
Large Language Models for Mathematical Reasoning: Progresses and
  Challenges
Large Language Models for Mathematical Reasoning: Progresses and Challenges
Janice Ahn
Rishu Verma
Renze Lou
Di Liu
Rui Zhang
Wenpeng Yin
LRM
33
113
0
31 Jan 2024
TPD: Enhancing Student Language Model Reasoning via Principle Discovery
  and Guidance
TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance
Haorui Wang
Rongzhi Zhang
Yinghao Li
Lingkai Kong
Yuchen Zhuang
Xiusi Chen
Chao Zhang
LRM
38
4
0
24 Jan 2024
BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge
Yuhao Ji
Chao Fang
Zhongfeng Wang
19
3
0
22 Jan 2024
ReFT: Reasoning with Reinforced Fine-Tuning
ReFT: Reasoning with Reinforced Fine-Tuning
Trung Quoc Luong
Xinbo Zhang
Zhanming Jie
Peng Sun
Xiaoran Jin
Hang Li
OffRL
LRM
ReLM
32
79
0
17 Jan 2024
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible
  Pipeline
MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline
Minpeng Liao
Wei Luo
Chengxi Li
Jing Wu
Kai Fan
LRM
32
37
0
16 Jan 2024
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs'
  Mathematical Reasoning Capabilities
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities
Yujun Mao
Yoon Kim
Yilun Zhou
LRM
ReLM
12
17
0
13 Jan 2024
I am a Strange Dataset: Metalinguistic Tests for Language Models
I am a Strange Dataset: Metalinguistic Tests for Language Models
Tristan Thrush
Jared Moore
Miguel Monares
Christopher Potts
Douwe Kiela
14
5
0
10 Jan 2024
A Philosophical Introduction to Language Models -- Part I: Continuity
  With Classic Debates
A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates
Raphael Milliere
Cameron Buckner
LRM
ELM
22
18
0
08 Jan 2024
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
  Empowers Large Language Models to Serve as Intelligent Agents
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents
Ke Yang
Jiateng Liu
John Wu
Chaoqi Yang
Yi Ren Fung
...
Xu Cao
Xingyao Wang
Yiquan Wang
Heng Ji
Chengxiang Zhai
LLMAG
ELM
18
71
0
01 Jan 2024
Adapting Large Language Models for Education: Foundational Capabilities,
  Potentials, and Challenges
Adapting Large Language Models for Education: Foundational Capabilities, Potentials, and Challenges
Qingyao Li
Lingyue Fu
Weiming Zhang
Xianyu Chen
Jingwei Yu
Wei Xia
Weinan Zhang
Ruiming Tang
Yong Yu
AI4Ed
ELM
27
17
0
27 Dec 2023
Previous
123
Next