arXiv:2305.20050

Let's Verify Step by Step

International Conference on Learning Representations (ICLR), 2023

31 May 2023

Hunter Lightman

Harrison Edwards

Papers citing "Let's Verify Step by Step"

50 / 1,441 papers shown

Test-Time Alignment of LLMs via Sampling-Based Optimal Control in pre-logit space

Sekitoshi Kanai

Tsukasa Yoshida

Hiroshi Takahashi

Kazumune Hashimoto

115

0

0

30 Oct 2025

Kad: A Framework for Proxy-based Test-time Alignment with Knapsack Approximation Deferral

Pierre Zweigenbaum

238

0

0

30 Oct 2025

The Oversight Game: Learning to Cooperatively Balance an AI Agent's Safety and Autonomy

William Overman

89

0

0

30 Oct 2025

Cross-Platform Evaluation of Reasoning Capabilities in Foundation Models

207

0

0

30 Oct 2025

Zero Reinforcement Learning Towards General Domains

OffRL ReLM LRM AI4CE

165

0

0

29 Oct 2025

Reasoning-Aware GRPO using Process Mining

42

0

0

29 Oct 2025

TextualVerifier: Verify TextGrad Step-by-Step

Eugenius Mario Situmorang

Adila Alfa Krisnadhi

102

1

0

29 Oct 2025

Communication and Verification in LLM Agents towards Collaboration under Information Asymmetry

Cristian-Paul Bara

137

0

0

29 Oct 2025

SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation

Sina Bagheri Nezhad

95

1

0

29 Oct 2025

Are Language Models Efficient Reasoners? A Perspective from Logic Programming

Yanick Zengaffinen

Haruki Shirakami

Mrinmaya Sachan

Abulhair Saparov

Bernhard Schölkopf

158

0

0

29 Oct 2025

Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning

...

148

0

0

29 Oct 2025

A Survey on Efficient Large Language Model Training: From Data-centric Perspectives

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

145

4

0

29 Oct 2025

Evaluating the Role of Verifiers in Test-Time Scaling for Legal Reasoning Tasks

Jonathan Schwarz

Daniele Giofré

94

0

0

29 Oct 2025

Verifying Large Language Models' Reasoning Paths via Correlation Matrix Rank

91

0

0

28 Oct 2025

CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?

104

2

0

28 Oct 2025

The Kinetics of Reasoning: How Chain-of-Thought Shapes Learning in Transformers?

Costas Mavromatis

Huzefa Rangwala

98

0

0

28 Oct 2025

SPICE: Self-Play In Corpus Environments Improves Reasoning

Sainbayar Sukhbaatar

Jack Lanchantin

237

9

0

28 Oct 2025

MASPRM: Multi-Agent System Process Reward Model

Mahdi Mostajabdaveh

96

0

0

28 Oct 2025

Process Reward Models for Sentence-Level Verification of LVLM Radiology Reports

Jean-Benoit Delbrouck

Curtis P. Langlotz

96

0

0

27 Oct 2025

PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection

507

1

0

27 Oct 2025

Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards

Jan Niklas Groeneveld

Alexander Schaefer

339

0

0

27 Oct 2025

Think before Recommendation: Autonomous Reasoning-enhanced Recommender

151

0

0

27 Oct 2025

Adaptive Blockwise Search: Inference-Time Alignment for Large Language Models

Mohammad Atif Quamar

Ananth Shreekumar

Jonathan Rosenthal

Muslum Ozgur Ozmen

Mikhail Kuznetsov

Z. Berkay Celik

88

0

0

27 Oct 2025

Once Upon an Input: Reasoning via Per-Instance Program Synthesis

Neelay Velingker

173

0

0

26 Oct 2025

FAPO: Flawed-Aware Policy Optimization for Efficient and Reliable Reasoning

161

1

0

26 Oct 2025

Mapping Faithful Reasoning in Language Models

Andreas Damianou

José Luis Redondo García

Konstantina Palla

104

0

0

25 Oct 2025

When Fewer Layers Break More Chains: Layer Pruning Harms Test-Time Scaling in LLMs

118

1

0

25 Oct 2025

Weak-to-Strong Generalization under Distribution Shifts

195

0

0

24 Oct 2025

Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning

Ravindra Aribowo Tarunokusumo

Rafael Fernandes Cunha

142

0

0

24 Oct 2025

The Universal Landscape of Human Reasoning

...

95

1

0

24 Oct 2025

Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models

OffRL CLL KELM VLM LRM

135

0

0

24 Oct 2025

Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection

Gaiar Baimuratov

102

0

0

23 Oct 2025

Self-Jailbreaking: Language Models Can Reason Themselves Out of Safety Alignment After Benign Reasoning Training

Stephen H. Bach

250

0

0

23 Oct 2025

Limits of PRM-Guided Tree Search for Mathematical Reasoning with LLMs

Tristan Cinquin

Agustinus Kristiadi

243

0

0

23 Oct 2025

What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation

161

1

1

23 Oct 2025

LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

OffRL RALM ReLM LRM

242

3

0

22 Oct 2025

CircuitSeer: Mining High-Quality Data by Probing Mathematical Reasoning Circuits in LLMs

165

1

0

21 Oct 2025

Activating Visual Context and Commonsense Reasoning through Masked Prediction in VLMs

112

0

0

21 Oct 2025

Reasoning Language Model Inference Serving Unveiled: An Empirical Study

256

1

0

21 Oct 2025

WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality

118

0

0

21 Oct 2025

What Makes a Good Curriculum? Disentangling the Effects of Data Ordering on LLM Mathematical Reasoning

Soroush Vosoughi

198

1

0

21 Oct 2025

Foundational Automatic Evaluators: Scaling Multi-Task Generative Evaluator Training for Reasoning-Centric Domains

Xuan-Phi Nguyen

OffRL ALM LRM ELM

225

0

0

20 Oct 2025

Soft-Masked Diffusion Language Models

Michael Hersche

Samuel Moor-Smith

314

1

0

20 Oct 2025

Inference-Time Compute Scaling For Flow Matching

Noah El Rimawi-Fine

Mathieu Blanchette

116

0

0

20 Oct 2025

Fine-tuning Flow Matching Generative Models with Intermediate Feedback

161

1

0

20 Oct 2025

Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs

Paula Cordero-Encinar

196

1

0

20 Oct 2025

Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling

Mehmet Onurcan Kaya

Dim P. Papadopoulos

308

0

0

19 Oct 2025

DAG-Math: Graph-Guided Mathematical Reasoning in LLMs

Ilja Kuzborskij

154

1

0

19 Oct 2025

A Comprehensive Survey on Reinforcement Learning-based Agentic Search: Foundations, Roles, Optimizations, Evaluations, and Applications

Charu C. Aggarwal

560

2

0

19 Oct 2025

Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?

211

1

0

18 Oct 2025
