ResearchTrend.AI
Large Language Models' Reasoning Stalls: An Investigation into the Capabilities of Frontier Models

26 May 2025
Lachlan McGinness
Peter Baumgartner
    ReLM LRM ELM

Papers citing "Large Language Models' Reasoning Stalls: An Investigation into the Capabilities of Frontier Models"

19 / 19 papers shown
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
Xumeng Wen
Zihan Liu
Shun Zheng
Zhijian Xu
Shengyu Ye
...
Yang Wang
Junjie Li
Ziming Miao
Jiang Bian
Mao Yang
LRM
45
0
0
17 Jun 2025
Steamroller Problems: An Evaluation of LLM Reasoning Capability with Automated Theorem Prover Strategies
Lachlan McGinness
Peter Baumgartner
LRM
63
1
0
17 Jul 2024
Reasoning in Large Language Models: A Geometric Perspective
Romain Cosentino
Sarath Shekkizhar
LRM
104
3
0
02 Jul 2024
Let's Think Dot by Dot: Hidden Computation in Transformer Language Models
Jacob Pfau
William Merrill
Samuel R. Bowman
LRM
100
83
0
24 Apr 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV RALM
347
1,846
1
18 Dec 2023
Instruction-Following Evaluation for Large Language Models
Jeffrey Zhou
Tianjian Lu
Swaroop Mishra
Siddhartha Brahma
Sujoy Basu
Yi Luan
Denny Zhou
Le Hou
ELM ALM LRM
107
299
0
14 Nov 2023
Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review
Banghao Chen
Zhaofeng Zhang
Nicolas Langrené
Shengxin Zhu
LLMAG
122
89
0
23 Oct 2023
Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve
R. Thomas McCoy
Shunyu Yao
Dan Friedman
Matthew Hardy
Thomas Griffiths
71
160
0
24 Sep 2023
DISC-LawLLM: Fine-tuning Large Language Models for Intelligent Legal Services
Shengbin Yue
Wei Chen
Siyuan Wang
Bingxuan Li
Chenchen Shen
...
Yuxuan Zhou
Yao Xiao
Song Yun
Xuanjing Huang
Zhongyu Wei
AILaw ELM
119
99
0
20 Sep 2023
Large Language Models Fail on Trivial Alterations to Theory-of-Mind Tasks
T. Ullman
LRM
92
241
0
16 Feb 2023
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM ELM LRM ReLM
301
1,144
0
17 Oct 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM OffRL LRM
451
4,610
0
27 Oct 2021
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM AIMat ReCod ALM
218
2,024
0
16 Aug 2021
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM ALM
307
5,702
0
07 Jul 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
100
251
0
02 Feb 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
679
4,948
0
23 Jan 2020
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
250
2,538
0
19 May 2019
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs
Dheeru Dua
Yizhong Wang
Pradeep Dasigi
Gabriel Stanovsky
Sameer Singh
Matt Gardner
AIMat
189
967
0
01 Mar 2019
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Zhilin Yang
Peng Qi
Saizheng Zhang
Yoshua Bengio
William W. Cohen
Ruslan Salakhutdinov
Christopher D. Manning
RALM
277
2,712
0
25 Sep 2018