ResearchTrend.AI
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

7 October 2024
Iman Mirzadeh
Keivan Alizadeh
Hooman Shahrokhi
Oncel Tuzel
Samy Bengio
Mehrdad Farajtabar
    AIMat
    LRM

Papers citing "GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models"

39 / 89 papers shown
RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering
Sichu Liang
Linhai Zhang
Hongyu Zhu
Wenwen Wang
Yulan He
Deyu Zhou
RALM
39
0
0
19 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
52
1
0
17 Feb 2025
Do Large Language Models Reason Causally Like Us? Even Better?
Hanna M. Dettki
Brenden M. Lake
Charley M. Wu
Bob Rehder
ReLM
ELM
LRM
90
0
0
14 Feb 2025
One Example Shown, Many Concepts Known! Counterexample-Driven Conceptual Reasoning in Mathematical LLMs
Yinghui Li
Jiayi Kuang
Haojing Huang
Zhikun Xu
Xinnian Liang
...
Xiaoyu Tan
C. Qu
Ying Shen
Hai-Tao Zheng
Philip S. Yu
LRM
41
3
0
12 Feb 2025
Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring
Alex Heyman
Joel Zylberberg
LRM
40
0
0
10 Feb 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
Kaixuan Huang
Jiacheng Guo
Zihao Li
X. Ji
Jiawei Ge
...
Yangsibo Huang
Chi Jin
Xinyun Chen
Chiyuan Zhang
Mengdi Wang
AAML
LRM
78
7
0
10 Feb 2025
GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
Yang Zhou
Hongyi Liu
Zhuoming Chen
Yuandong Tian
Beidi Chen
LRM
52
7
0
07 Feb 2025
Large Language Models for Multi-Robot Systems: A Survey
Peihan Li
Zijian An
Shams Abrar
Lifeng Zhou
LM&Ro
LRM
44
4
0
06 Feb 2025
Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment
Cheryl Li
Tianyuan Xu
Yiwen Guo
LRM
67
2
0
05 Feb 2025
Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning
Yibo Yan
Shen Wang
Jiahao Huo
Jingheng Ye
Zhendong Chu
Xuming Hu
Philip S. Yu
Carla P. Gomes
B. Selman
Qingsong Wen
LRM
111
9
0
05 Feb 2025
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen
Guangtao Zeng
Zhenting Qi
Zhang-Wei Hong
Zhenfang Chen
Wei Lu
G. Wornell
Subhro Das
David D. Cox
Chuang Gan
LLMAG
LRM
70
5
0
04 Feb 2025
PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models
C. Anderson
Joydeep Biswas
Aleksander Boruch-Gruszecki
Federico Cassano
Molly Q. Feldman
Francesca Lucchetti
Zixuan Wu
Arjun Guha
ReLM
ELM
LRM
39
3
0
03 Feb 2025
AtmosSci-Bench: Evaluating the Recent Advance of Large Language Model for Atmospheric Science
Chenyue Li
Wen Deng
Mengqian Lu
Binhang Yuan
ELM
AI4Cl
LRM
87
0
0
03 Feb 2025
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping
Pu Yang
Yunzhen Feng
Ziyuan Chen
Yuhang Wu
Zhuoyuan Li
DiffM
91
0
0
31 Jan 2025
Improving Rule-based Reasoning in LLMs via Neurosymbolic Representations
Varun Dhanraj
Chris Eliasmith
LRM
40
0
0
31 Jan 2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar
Harshay Shah
Dan Busbridge
Alaaeldin Mohamed Elnouby Ali
J. Susskind
Vimal Thilak
MoE
LRM
33
4
0
28 Jan 2025
MCP-Solver: Integrating Language Models with Constraint Programming Systems
Stefan Szeider
27
0
0
31 Dec 2024
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
Jiajun Song
Zhuoyan Xu
Yiqiao Zhong
67
4
0
31 Dec 2024
Formal Mathematical Reasoning: A New Frontier in AI
Kaiyu Yang
Gabriel Poesia
Jingxuan He
Wenda Li
Kristin Lauter
Swarat Chaudhuri
Dawn Song
LRM
AI4CE
82
20
0
20 Dec 2024
Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying
Federico Castagna
I. Sassoon
Simon Parsons
LRM
85
0
0
19 Dec 2024
Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment
Kevin You
64
0
0
18 Dec 2024
On Large Language Models in Mission-Critical IT Governance: Are We Ready Yet?
Matteo Esposito
Francesco Palagiano
Valentina Lenarduzzi
Davide Taibi
ELM
64
2
0
16 Dec 2024
Mining Math Conjectures from LLMs: A Pruning Approach
Jake Chuharski
Elias Rojas Collins
Mark Meringolo
LRM
69
0
0
09 Dec 2024
TQA-Bench: Evaluating LLMs for Multi-Table Question Answering with Scalable Context and Symbolic Extension
Zipeng Qiu
You Peng
Guangxin He
Binhang Yuan
Chen Wang
LMTD
83
2
0
29 Nov 2024
Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell
Rupert Tawiah-Quashie
T. Ullman
71
2
0
26 Nov 2024
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
Zihao Li
Yuan Cao
Cheng Gao
Yihan He
Han Liu
Jason M. Klusowski
Jianqing Fan
Mengdi Wang
MLT
44
1
0
16 Nov 2024
Enhancing LLM Evaluations: The Garbling Trick
William F. Bradley
ELM
LRM
36
1
0
03 Nov 2024
Exploratory Models of Human-AI Teams: Leveraging Human Digital Twins to Investigate Trust Development
Daniel Nguyen
Myke C. Cohen
Hsien-Te Kao
Grant Engberson
Louis Penafiel
Spencer Lynch
Svitlana Volkova
21
1
0
01 Nov 2024
Toward Automated Algorithm Design: A Survey and Practical Guide to Meta-Black-Box-Optimization
Zeyuan Ma
Hongshu Guo
Yue-jiao Gong
Jun Zhang
Kay Chen Tan
95
2
0
01 Nov 2024
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
41
20
0
30 Oct 2024
Can Large Language Models Act as Symbolic Reasoners?
Rob Sullivan
Nelly Elsayed
ELM
LRM
22
4
0
28 Oct 2024
Learning Mathematical Rules with Large Language Models
Antoine Gorceix
Bastien Le Chenadec
Ahmad Rammal
N. Vadori
Manuela Veloso
18
1
0
22 Oct 2024
Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
Norbert Tihanyi
Tamás Bisztray
Richard A. Dubniczky
Rebeka Tóth
B. Borsos
...
Ryan Marinelli
Lucas C. Cordeiro
Merouane Debbah
Vasileios Mavroeidis
Audun Josang
16
4
0
20 Oct 2024
Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration From Cognitive Psychology
Wei Xie
Shuoyoucheng Ma
Zhenhua Wang
Enze Wang
Kai Chen
Xiaobing Sun
Baosheng Wang
LRM
35
0
0
19 Oct 2024
Models Can and Should Embrace the Communicative Nature of Human-Generated Math
Sasha Boguraev
Ben Lipkin
Leonie Weissweiler
Kyle Mahowald
41
1
0
25 Sep 2024
Validation Requirements for AI-based Intervention-Evaluation in Aging and Longevity Research and Practice
G. Fuellen
Anton Y Kulaga
Sebastian Lobentanzer
Maximilian Unfried
Roberto Avelar
Daniel Palmer
Brian K. Kennedy
19
1
0
11 Aug 2024
AI-Assisted Generation of Difficult Math Questions
Vedant Shah
Dingli Yu
Kaifeng Lyu
Simon Park
Nan Rosemary Ke
...
Yoshua Bengio
Sanjeev Arora
Anirudh Goyal
32
14
0
30 Jul 2024
PUZZLES: A Benchmark for Neural Algorithmic Reasoning
Benjamin Estermann
Luca A. Lanzendörfer
Yannick Niedermayr
Roger Wattenhofer
40
2
0
29 Jun 2024
Evaluating Large Vision-and-Language Models on Children's Mathematical Olympiads
A. Cherian
Kuan-Chuan Peng
Suhas Lohit
Joanna Matthiesen
Kevin A. Smith
J. Tenenbaum
ELM
LRM
39
6
0
22 Jun 2024