Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

International Conference on Learning Representations (ICLR), 2024

29 August 2024

Papers citing "Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems"

Tailored Primitive Initialization is the Secret Key to Reinforcement Learning

198

16 Nov 2025

EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences

337

07 Oct 2025

Modeling Student Learning with 3.8 Million Program Traces

153

06 Oct 2025

MedReflect: Teaching Medical LLMs to Self-Improve via Reflective Correction

207

04 Oct 2025

OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows

285

03 Oct 2025

Teaching Transformers to Solve Combinatorial Problems through Efficient Trial & Error

Panagiotis Giannoulis

Yorgos Pantis

Christos Tzamos

170

26 Sep 2025

PALADIN: Self-Correcting Language Model Agents to Cure Tool-Failure Cases

Sri Vatsa Vuddanti

Aarav Shah

Satwik Kumar Chittiprolu

171

25 Sep 2025

Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels

156

20 Sep 2025

RetrySQL: text-to-SQL training with retry data for self-correcting query generation

383

03 Jul 2025

A Survey on Large Language Models for Mathematical Reasoning

...

376

10 Jun 2025

Boosting LLM Reasoning via Spontaneous Self-Correction

...

300

07 Jun 2025

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

1.2K

06 Jun 2025

Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding

347

24 May 2025

Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving

400

07 May 2025

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

675

220

30 Apr 2025

Process Reward Models That Think

631

23 Apr 2025

LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception

1.1K

21 Apr 2025

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

Runlong Zhou

Yi Zhang

RALM

328

02 Apr 2025

RARE: Retrieval-Augmented Reasoning Modeling

...

466

30 Mar 2025

Controlling Large Language Model with Latent Actions

366

27 Mar 2025

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

653

352

03 Mar 2025

Self-Training Elicits Concise Reasoning in Large Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

783

27 Feb 2025

Learning to Reason from Feedback at Test-Time

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

444

16 Feb 2025

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?

337

07 Feb 2025

Chasing Progress, Not Perfection: Revisiting Strategies for End-to-End LLM Plan Generation

371

14 Dec 2024

COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement

937

12 Oct 2024

O1 Replication Journey: A Strategic Progress Report -- Part 1

...

431

149

08 Oct 2024

Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles

Neural Information Processing Systems (NeurIPS), 2024

300

16 Sep 2024

ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models

Neural Information Processing Systems (NeurIPS), 2024

312

15 May 2024

Physics of Language Models: Part 3.2, Knowledge Manipulation

International Conference on Learning Representations (ICLR), 2023

Zeyuan Allen-Zhu

Yuanzhi Li

KELM

561

145

25 Sep 2023

Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

International Conference on Machine Learning (ICML), 2023

Zeyuan Allen-Zhu

Yuanzhi Li

KELM

664

258

25 Sep 2023

Physics of Language Models: Part 1, Learning Hierarchical Language Structures

Zeyuan Allen-Zhu

Yuanzhi Li

634

23 May 2023