Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

International Conference on Learning Representations (ICLR), 2022

28 May 2022

Chenglong Wang

Papers citing "Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions"

In-Token Rationality Optimization: Towards Accurate and Concise LLM Reasoning via Self-Feedback

216

13 Nov 2025

ReviewScore: Misinformed Peer Review Detection with Large Language Models

...

135

25 Sep 2025

GPO: Learning from Critical Steps to Improve LLM Reasoning

184

19 Sep 2025

Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)

Chongli Qin

Jost Tobias Springenberg

OffRL

208

17 Jul 2025

Can Large Reasoning Models Self-Train?

412

27 May 2025

Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

524

23 May 2025

STaR-SQL: Self-Taught Reasoner for Text-to-SQL

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

198

20 Feb 2025

Evolutionary Pre-Prompt Optimization for Mathematical Reasoning

231

05 Dec 2024

Keep Guessing? When Considering Inference Scaling, Mind the Baselines

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

403

20 Oct 2024

Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2024

Xiaolin Ai

394

08 Oct 2024

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

233

07 Oct 2024

Interpreting and Improving Large Language Models in Arithmetic Calculation

International Conference on Machine Learning (ICML), 2024

Wei Zhang

Chaoqun Wan

Yonggang Zhang

Yiu-ming Cheung

Xinmei Tian

Xu Shen

Jieping Ye

LRM

323

03 Sep 2024

Weak-to-Strong Reasoning

330

18 Jul 2024

Advancing Process Verification for Large Language Models via Tree-Based Preference Learning

Weiming Lu

220

29 Jun 2024

PORT: Preference Optimization on Reasoning Traces

331

23 Jun 2024

Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models

Zhiyong Wu

189

17 Jun 2024

Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs

Qian Liu

279

118

13 Jun 2024

AICoderEval: Improving AI Domain Code Generation of Large Language Models

155

07 Jun 2024

mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models

Huiyuan Lai

Malvina Nissim

LRM

424

04 Jun 2024

NExT: Teaching Large Language Models to Reason about Code Execution

Ansong Ni

Miltiadis Allamanis

Arman Cohan

Yinlin Deng

267

23 Apr 2024

Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

339

16 Apr 2024

Eliciting Better Multilingual Structured Reasoning from LLMs through Code

342

05 Mar 2024

Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step

Li Zhong

Zilong Wang

Jingbo Shang

439

121

25 Feb 2024

An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning

191

23 Feb 2024

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

259

17 Feb 2024

V-STaR: Training Verifiers for Self-Taught Reasoners

Nikolay Malkin

327

192

09 Feb 2024

CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay

Michaël Defferrard

207

07 Feb 2024

Multi-step Problem Solving Through a Verifier: An Empirical Analysis on Model-induced Process Supervision

231

05 Feb 2024

TinyGSM: achieving >80% on GSM8k with small language models

251

14 Dec 2023

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

...

Jascha Narain Sohl-Dickstein

Noah Fiedel

ALM LRM ReLM SyDa

614

246

11 Dec 2023

SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving

Xueliang Zhao

Xinting Huang

Wei Bi

Lingpeng Kong

LRM

239

19 Oct 2023

Exploration with Principles for Diverse AI Supervision

Hao Liu

Matei A. Zaharia

Pieter Abbeel

307

13 Oct 2023

MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Hongyi Yuan

Keming Lu

Chuanqi Tan

Chang Zhou

314

09 Oct 2023

Resprompt: Residual Connection Prompting Advances Multi-Step Reasoning in Large Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

...

288

07 Oct 2023

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning

International Conference on Learning Representations (ICLR), 2023

351

118

04 Oct 2023

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

Transactions of the Association for Computational Linguistics (TACL), 2023

...

Yingbo Zhou

Arman Cohan

244

29 Sep 2023

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

International Conference on Learning Representations (ICLR), 2023

...

800

624

18 Aug 2023

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Zheng Yuan

Hongyi Yuan

Cheng Li

Guanting Dong

Keming Lu

Chuanqi Tan

Chang Zhou

Jingren Zhou

LRM ALM

337

281

03 Aug 2023

GRACE: Discriminator-Guided Chain-of-Thought Reasoning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

346

24 May 2023

Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models

International Conference on Language Resources and Evaluation (LREC), 2023

...

320

21 May 2023