Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

International Conference on Learning Representations (ICLR), 2024
29 August 2024
Tian Ye
Zicheng Xu
Yuanzhi Li
Zeyuan Allen-Zhu
ReLM, LRM

Papers citing "Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems"

32 / 32 papers shown
Tailored Primitive Initialization is the Secret Key to Reinforcement Learning
Yihang Yao
Guangtao Zeng
Raina Wu
Yang Zhang
Ding Zhao
Zhang-Wei Hong
Chuang Gan
OffRL, LRM
198
1
0
16 Nov 2025
EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
Kshitish Ghate
Andy Liu
Devansh Jain
Taylor Sorensen
Atoosa Kasirzadeh
Aylin Caliskan
Mona Diab
Maarten Sap
LLMSV
337
0
0
07 Oct 2025
Modeling Student Learning with 3.8 Million Program Traces
Alexis Ross
Megha Srivastava
Jeremiah Blanchard
Jacob Andreas
153
5
0
06 Oct 2025
MedReflect: Teaching Medical LLMs to Self-Improve via Reflective Correction
Yue Huang
Yanyuan Chen
Dexuan Xu
Weihua Yue
H. Zhang
Meikang Qiu
AI4MH, LRM
207
4
0
04 Oct 2025
OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows
John Nguyen
Marton Havasi
Tariq Berrada
Luke Zettlemoyer
Ricky T. Q. Chen
285
9
0
03 Oct 2025
Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error
Panagiotis Giannoulis
Yorgos Pantis
Christos Tzamos
170
1
0
26 Sep 2025
PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases
Sri Vatsa Vuddanti
Aarav Shah
Satwik Kumar Chittiprolu
Tony Song
Sunishchal Dev
Kevin Zhu
Maheep Chaudhary
KELM
171
7
0
25 Sep 2025
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels
Junjie Ye
Yuming Yang
Yang Nan
Shuo Li
Qi Zhang
Tao Gui
Xuanjing Huang
Liang Luo
Zhongchao Shi
Jianping Fan
156
2
0
20 Sep 2025
RetrySQL: text-to-SQL training with retry data for self-correcting query generation
Alicja Rączkowska
Riccardo Belluzzo
Piotr Zieliński
Joanna Baran
Paweł Olszewski
SyDa, LRM
383
1
0
03 Jul 2025
A Survey on Large Language Models for Mathematical Reasoning
Peng-Yuan Wang
Tian-Shuo Liu
Chenyang Wang
Yi-Di Wang
Shu Yan
...
Xu-Hui Liu
Xin-Wei Chen
Jia-Cheng Xu
Ziniu Li
Yang Yu
LRM
376
34
0
10 Jun 2025
Boosting LLM Reasoning via Spontaneous Self-Correction
Xutong Zhao
Tengyu Xu
Xuewei Wang
Zhengxing Chen
Di Jin
...
Yun He
Sinong Wang
Han Fang
Sarath Chandar
Chen Zhu
ReLM, LRM, KELM
300
10
0
07 Jun 2025
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
Gouki Minegishi
Hiroki Furuta
Takeshi Kojima
Yusuke Iwasawa
Y. Matsuo
LRM
1.2K
16
0
06 Jun 2025
Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding
Yixuan Wang
Yijun Liu
Shiyu Ji
Yuzhuang Xu
Yang Xu
Qingfu Zhu
Wanxiang Che
OffRL, LRM
347
2
0
24 May 2025
Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving
Zijun Chen
Xinhao Zheng
Renqiu Xia
Xingzhi Qi
Qinxiang Cao
Junchi Yan
AIMat
400
2
0
07 May 2025
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaochen Li
Jiajie Jin
Guanting Dong
Hongjin Qian
Yutao Zhu
Yongkang Wu
Ji-Rong Wen
Zhicheng Dou
LLMAG, LRM
675
220
0
30 Apr 2025
Process Reward Models That Think
Muhammad Khalifa
Rishabh Agarwal
Lajanugen Logeswaran
Jaekyeom Kim
Hao Peng
Moontae Lee
Honglak Lee
Lu Wang
OffRL, ALM, LRM
631
62
0
23 Apr 2025
LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception
Yuan-Hong Liao
Sven Elflein
Liu He
Laura Leal-Taixe
Yejin Choi
Sanja Fidler
David Acuna
ReLM, LRM, VLM
1.1K
13
0
21 Apr 2025
CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models
Runlong Zhou
Yi Zhang
RALM
328
1
0
02 Apr 2025
RARE: Retrieval-Augmented Reasoning Modeling
Zhengren Wang
Jiayang Yu
Dongsheng Ma
Zhe Chen
Yu Wang
...
Feiyu Xiong
Yanfeng Wang
Weinan E
Linpeng Tang
Feiyu Xiong
RALM, LRM
466
7
0
30 Mar 2025
Controlling Large Language Model with Latent Actions
Chengxing Jia
Ziniu Li
Pengyuan Wang
Yi-Chen Li
Zhenyu Hou
Yuxiao Dong
Y. Yu
366
5
0
27 Mar 2025
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
Kanishk Gandhi
Ayush Chakravarthy
Anikait Singh
Nathan Lile
Noah D. Goodman
ReLM, LRM
653
352
0
03 Mar 2025
Self-Training Elicits Concise Reasoning in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Tergel Munkhbat
Namgyu Ho
S. Kim
Yongjin Yang
Yujin Kim
Se-Young Yun
ReLM, LRM
783
79
0
27 Feb 2025
Learning to Reason from Feedback at Test-Time
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yanyang Li
Michael R. Lyu
Liwei Wang
LRM
444
10
0
16 Feb 2025
GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
Yang Zhou
Hongyi Liu
Zhuoming Chen
Yuandong Tian
Beidi Chen
LRM
337
0
0
07 Feb 2025
Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation
Sukai Huang
Trevor Cohn
Nir Lipovetzky
LRM
371
5
0
14 Dec 2024
COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement
Yuxi Xie
Anirudh Goyal
Xiaobao Wu
Xunjian Yin
Xiao Xu
Min-Yen Kan
Liangming Pan
William Yang Wang
LRM
937
2
0
12 Oct 2024
O1 Replication Journey: A Strategic Progress Report -- Part 1
Yiwei Qin
Xuefeng Li
Haoyang Zou
Yixiu Liu
Shijie Xia
...
Yixin Ye
Weizhe Yuan
Hector Liu
Rui Wang
Pengfei Liu
VLM
431
149
0
08 Oct 2024
Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles
Neural Information Processing Systems (NeurIPS), 2024
Kulin Shah
Nishanth Dikkala
Xin Wang
Rina Panigrahy
ELM, ReLM, LRM
300
27
0
16 Sep 2024
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
Neural Information Processing Systems (NeurIPS), 2024
Siwei Wang
Yifei Shen
Shi Feng
Haoran Sun
Shang-Hua Teng
Wei Chen
312
15
0
15 May 2024
Physics of Language Models: Part 3.2, Knowledge Manipulation
International Conference on Learning Representations (ICLR), 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
561
145
0
25 Sep 2023
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
International Conference on Machine Learning (ICML), 2023
Zeyuan Allen-Zhu
Yuanzhi Li
KELM
664
258
0
25 Sep 2023
Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Zeyuan Allen-Zhu
Yuanzhi Li
634
48
0
23 May 2023