arXiv:2305.20050
Let's Verify Step by Step
International Conference on Learning Representations (ICLR), 2023
31 May 2023
Hunter Lightman
V. Kosaraju
Yura Burda
Harrison Edwards
Bowen Baker
Teddy Lee
Jan Leike
John Schulman
Ilya Sutskever
K. Cobbe
ALM
OffRL
LRM
Papers citing "Let's Verify Step by Step"
50 / 1,390 papers shown
ScRPO: From Errors to Insights
Lianrui Li
Dakuan Lu
Jiawei Shao
Chi Zhang
Xuelong Li
LRM
119
0
0
08 Nov 2025
An Empirical Study of Reasoning Steps in Thinking Code LLMs
Haoran Xue
Gias Uddin
Song Wang
LRM
92
1
0
08 Nov 2025
Adapting Web Agents with Synthetic Supervision
Zhaoyang Wang
Yiming Liang
Xuchao Zhang
Qianhui Wu
Siwei Han
...
Chetan Bansal
Baolin Peng
J. Gao
Saravan Rajmohan
Huaxiu Yao
96
0
0
08 Nov 2025
Lethe: Layer- and Time-Adaptive KV Cache Pruning for Reasoning-Intensive LLM Serving
Hui Zeng
Daming Zhao
Pengfei Yang
WenXuan Hou
Tianyang Zheng
Hui Li
Weiye Ji
Jidong Zhai
172
1
0
08 Nov 2025
Motif 2 12.7B technical report
Junghwan Lim
S. W. Lee
Dongseok Kim
Taehyun Kim
Eunhwan Park
...
Kungyu Lee
Dongpin Oh
Yeongjae Park
Bokki Ryu
Dongjoo Weon
92
0
0
07 Nov 2025
Reusing Pre-Training Data at Test Time is a Compute Multiplier
Alex Fang
Thomas Voice
Ruoming Pang
Ludwig Schmidt
Tom Gunter
102
0
0
06 Nov 2025
An MLCommons Scientific Benchmarks Ontology
B. Hawks
G. V. Laszewski
Matthew D. Sinclair
Marco Colombo
Shivaram Venkataraman
Rutwik Jain
Yiwei Jiang
Nhan Tran
Geoffrey C. Fox
80
1
0
06 Nov 2025
If I Could Turn Back Time: Temporal Reframing as a Historical Reasoning Task for LLMs
Lars Bungum
Charles Yijia Huang
Abeer Kashar
117
0
0
06 Nov 2025
Efficient Reasoning via Thought-Training and Thought-Free Inference
Canhui Wu
Qiong Cao
Chao Xue
Wei Xi
Xiaodong He
ReLM
LRM
379
0
0
05 Nov 2025
CGES: Confidence-Guided Early Stopping for Efficient and Accurate Self-Consistency
Ehsan Aghazadeh
Ahmad Ghasemi
Hedyeh Beyhaghi
Hossein Pishro-Nik
LRM
137
0
0
04 Nov 2025
Lookahead Unmasking Elicits Accurate Decoding in Diffusion Language Models
Sanghyun Lee
Seungryong Kim
Jongho Park
D. Park
115
1
0
04 Nov 2025
Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning
Ru Wang
Wei Huang
Qi Cao
Yusuke Iwasawa
Yutaka Matsuo
Jiaxian Guo
114
0
0
03 Nov 2025
Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
Yan Sun
Jia Guo
Stanley Kok
Zihao Wang
ZuJie Wen
Zhiqiang Zhang
OffRL
LRM
148
0
0
02 Nov 2025
Reasoning Planning for Language Models
Bao Nguyen
Hieu Trung Nguyen
Ruifeng She
Xiaojin Fu
V. Nguyen
ReLM
LRM
440
0
0
01 Nov 2025
LongCat-Flash-Omni Technical Report
M-A-P Team
Bairui Wang
Bayan
Bin Xiao
Bo Zhang
...
Xin Pan
Xin Chen
Xiusong Sun
Xu Xiang
X. Xing
MLLM
VLM
502
2
0
31 Oct 2025
VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision
Xuan Gong
Senmiao Wang
Hanbo Huang
Ruoyu Sun
Shiyu Liang
OffRL
LRM
102
0
0
31 Oct 2025
Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space
Sekitoshi Kanai
Tsukasa Yoshida
Hiroshi Takahashi
Haru Kuroki
Kazumune Hashimoto
93
0
0
30 Oct 2025
e1: Learning Adaptive Control of Reasoning Effort
Michael Kleinman
Matthew Trager
Alessandro Achille
Wei Xia
Stefano Soatto
LRM
219
2
0
30 Oct 2025
RCScore: Quantifying Response Consistency in Large Language Models
Dongjun Jang
Youngchae Ahn
Hyopil Shin
120
0
0
30 Oct 2025
A Survey on Efficient Large Language Model Training: From Data-centric Perspectives
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junyu Luo
Bohan Wu
Xiao Luo
Zhiping Xiao
Yiqiao Jin
...
Nan Yin
Yifan Wang
Jingyang Yuan
Wei Ju
Ming Zhang
132
3
0
29 Oct 2025
TextualVerifier: Verify TextGrad Step-by-Step
Eugenius Mario Situmorang
Adila Alfa Krisnadhi
Ari Wibisono
LRM
80
1
0
29 Oct 2025
SPICE: Self-Play In Corpus Environments Improves Reasoning
Bo Liu
Chuanyang Jin
Seungone Kim
Weizhe Yuan
Wenting Zhao
Ilia Kulikov
Xian Li
Sainbayar Sukhbaatar
Jack Lanchantin
Jason Weston
ReLM
LRM
230
6
0
28 Oct 2025
PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection
Yusu Qian
Cheng Wan
Chao Jia
Yinfei Yang
Qingyu Zhao
Zhe Gan
LRM
ReLM
482
1
0
27 Oct 2025
Think before Recommendation: Autonomous Reasoning-enhanced Recommender
Xiaoyu Kong
Junguang Jiang
Bin Liu
Ziru Xu
Han Zhu
Jian Xu
Bo Zheng
Jiancan Wu
Xiang Wang
LRM
124
0
0
27 Oct 2025
Process Reward Models for Sentence-Level Verification of LVLM Radiology Reports
Alois Thomas
M. Varma
Jean-Benoit Delbrouck
Curtis P. Langlotz
84
0
0
27 Oct 2025
Once Upon an Input: Reasoning via Per-Instance Program Synthesis
Adam Stein
Neelay Velingker
Mayur Naik
Eric Wong
ReLM
LRM
152
0
0
26 Oct 2025
When Fewer Layers Break More Chains: Layer Pruning Harms Test-Time Scaling in LLMs
Keyu Wang
Tian Lyu
Guinan Su
Jonas Geiping
L. Yin
Marco Canini
Shiwei Liu
LRM
109
1
0
25 Oct 2025
Mapping Faithful Reasoning in Language Models
Jiazheng Li
Andreas Damianou
J Rosser
José Luis Redondo García
Konstantina Palla
LRM
88
0
0
25 Oct 2025
Weak-to-Strong Generalization under Distribution Shifts
Myeongho Jeon
Jan Sobotka
Suhwan Choi
Maria Brbić
OOD
184
0
0
24 Oct 2025
Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning
Ravindra Aribowo Tarunokusumo
Rafael Fernandes Cunha
OffRL
ReLM
LRM
124
0
0
24 Oct 2025
The Universal Landscape of Human Reasoning
Qiguang Chen
Jinhao Liu
L. Qin
Yimeng Zhang
Yihao Liang
...
Mengkang Hu
Yantao Du
Z. Chen
Xie Chen
Wanxiang Che
LRM
84
0
0
24 Oct 2025
Self-Jailbreaking: Language Models Can Reason Themselves Out of Safety Alignment After Benign Reasoning Training
Zheng-Xin Yong
Stephen H. Bach
LRM
216
0
0
23 Oct 2025
Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection
Jack Butler
Nikita Kozodoi
Zainab Afolabi
Brian Tyacke
Gaiar Baimuratov
97
0
0
23 Oct 2025
What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation
Heejin Do
Jaehui Hwang
Dongyoon Han
Seong Joon Oh
Sangdoo Yun
ELM
LRM
132
1
1
23 Oct 2025
Limits of PRM-Guided Tree Search for Mathematical Reasoning with LLMs
Tristan Cinquin
Geoff Pleiss
Agustinus Kristiadi
AIMat
LRM
235
0
0
23 Oct 2025
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
S. S. Wang
Gaokai Zhang
Li Zhang
Ning Shang
Fan Yang
Dongyao Chen
M. Yang
OffRL
RALM
ReLM
LRM
225
0
0
22 Oct 2025
WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality
Chunyang Li
Yilun Zheng
Xinting Huang
Tianqing Fang
Jiahao Xu
Yangqiu Song
L. Chen
Han Hu
ELM
104
0
0
21 Oct 2025
Reasoning Language Model Inference Serving Unveiled: An Empirical Study
Qi Li
Junpan Wu
Xiang Liu
Yuxin Wang
Z. Li
Zhenheng Tang
Yuhan Chen
Shaohuai Shi
Xiaowen Chu
ReLM
LRM
232
1
0
21 Oct 2025
Activating Visual Context and Commonsense Reasoning through Masked Prediction in VLMs
Jiaao Yu
Shenwei Li
Mingjie Han
Yifei Yin
Wenzheng Song
Chenghao Jia
Man Lan
OffRL
LRM
92
0
0
21 Oct 2025
CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs
Shaobo Wang
Yongliang Miao
Yuancheng Liu
Qianli Ma
Ning Liao
Linfeng Zhang
LRM
145
1
0
21 Oct 2025
What Makes a Good Curriculum? Disentangling the Effects of Data Ordering on LLM Mathematical Reasoning
Yaning Jia
Chunhui Zhang
Xingjian Diao
Xiangchi Yuan
Z. Ouyang
Chiyu Ma
Soroush Vosoughi
LRM
163
0
0
21 Oct 2025
Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains
Austin Xu
Xuan-Phi Nguyen
Yilun Zhou
Chien-Sheng Wu
Caiming Xiong
Shafiq Joty
OffRL
ALM
LRM
ELM
213
0
0
20 Oct 2025
Inference-Time Compute Scaling For Flow Matching
Adam Stecklov
Noah El Rimawi-Fine
Mathieu Blanchette
104
0
0
20 Oct 2025
Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs
Paula Cordero-Encinar
Andrew Duncan
LRM
185
1
0
20 Oct 2025
Fine-tuning Flow Matching Generative Models with Intermediate Feedback
Jiajun Fan
Chaoran Cheng
Shuaike Shen
Xiangxin Zhou
Ge Liu
EGVM
148
1
0
20 Oct 2025
Soft-Masked Diffusion Language Models
Michael Hersche
Samuel Moor-Smith
Thomas Hofmann
Abbas Rahimi
264
0
0
20 Oct 2025
A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications
Minhua Lin
Zongyu Wu
Zhichao Xu
Hui Liu
Xianfeng Tang
Qi He
Charu C. Aggarwal
Hui Liu
Xiang Zhang
Suhang Wang
AI4TS
LRM
494
1
0
19 Oct 2025
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling
Erik Riise
Mehmet Onurcan Kaya
Dim P. Papadopoulos
267
0
0
19 Oct 2025
DAG-Math: Graph-Guided Mathematical Reasoning in LLMs
Yuanhe Zhang
Ilja Kuzborskij
Jason D. Lee
Chenlei Leng
Fanghui Liu
LRM
134
1
0
19 Oct 2025
Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?
Junchi Yu
Y. Liu
Jindong Gu
Philip Torr
Dongzhan Zhou
RALM
191
0
0
18 Oct 2025