arXiv:2305.00633

Self-Evaluation Guided Beam Search for Reasoning

Neural Information Processing Systems (NeurIPS), 2023

1 May 2023

Kenji Kawaguchi

Papers citing "Self-Evaluation Guided Beam Search for Reasoning"

50 / 148 papers shown

ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling

...

153

1

0

30 Apr 2026

Efficiency Will Not Lead to Sustainable Reasoning AI

Philipp Wiesner

Daniel W. O'Neill

Francesca Larosa

251

2

0

19 Nov 2025

Confidence-Guided Stepwise Model Routing for Cost-Efficient Reasoning

199

2

0

09 Nov 2025

Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective

Boyang Albert Li

209

2

0

01 Nov 2025

RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models

Rongjunchen Zhang

218

0

0

24 Oct 2025

Limits of PRM-Guided Tree Search for Mathematical Reasoning with LLMs

Tristan Cinquin

Agustinus Kristiadi

308

0

0

23 Oct 2025

Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents

226

1

0

21 Oct 2025

Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs

Paula Cordero-Encinar

247

4

0

20 Oct 2025

MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning

...

175

1

0

11 Oct 2025

Logit Arithmetic Elicits Long Reasoning Capabilities Without Training

Muhammad Khalifa

Xinliang Frederick Zhang

Farima Fatahi Bayat

150

6

0

10 Oct 2025

Increasing LLM response trustworthiness using voting ensembles

Aparna Nair-Kanneganti

Shir Goldfinger

Alison M. Pouch

173

1

0

05 Oct 2025

PatternKV: Flattening KV Representation Expands Quantization Headroom

...

223

0

0

05 Oct 2025

MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information

Ninghao Liu

183

0

0

04 Oct 2025

Efficient Test-Time Scaling for Small Vision-Language Models

Mehmet Onurcan Kaya

Desmond Elliott

Dim P. Papadopoulos

272

3

0

03 Oct 2025

Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs

Shahzaib Saqib Warraich

Dhruv Tarsadiya

Swabha Swayamdipta

230

0

0

03 Oct 2025

From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models

...

638

15

0

29 Sep 2025

Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts

230

2

0

26 Sep 2025

Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time

140

0

0

26 Sep 2025

Think Right, Not More: Test-Time Scaling for Numerical Claim Verification

Primakov Chungkham

150

1

0

26 Sep 2025

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

256

26

0

25 Sep 2025

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

322

6

0

19 Sep 2025

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

...

191

41

0

10 Sep 2025

From Long to Short: LLMs Excel at Trimming Own Reasoning Chains

212

1

0

07 Sep 2025

CoVeR: Conformal Calibration for Versatile and Reliable Autoregressive Next-Token Prediction

268

0

0

05 Sep 2025

Towards Reasoning for PDE Foundation Models: A Reward-Model-Driven Inference-Time-Scaling Algorithm

Siddharth Mansingh

Kamaljeet Singh

...

Nathan DeBardeleben

AI4TS AI4CE LRM

273

1

0

02 Sep 2025

LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation

236

0

0

27 Aug 2025

Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS

...

Dimitris N. Metaxas

368

10

0

19 Aug 2025

FedCoT: Communication-Efficient Federated Reasoning Enhancement for Large Language Models

196

1

0

07 Aug 2025

AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks

...

347

11

0

26 Jul 2025

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

370

2

0

16 Jul 2025

Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning

227

1

0

14 Jul 2025

Making VLMs More Robot-Friendly: Self-Critical Distillation of Low-Level Procedural Reasoning

Chan Young Park

Jillian R. Fisher

406

3

0

11 Jul 2025

Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models

350

4

0

03 Jul 2025

VIDEE: Visual and Interactive Decomposition, Execution, and Evaluation of Text Analytics with Intelligent Agents

386

0

0

17 Jun 2025

Learning to Reason Across Parallel Samples for LLM Reasoning

353

19

0

10 Jun 2025

HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains

265

0

0

09 Jun 2025

From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium

326

16

0

09 Jun 2025

LLM-First Search: Self-Guided Exploration of the Solution Space

Tim Rocktäschel

Roberta Raileanu

408

3

0

05 Jun 2025

Incentivizing LLMs to Self-Verify Their Answers

527

6

0

02 Jun 2025

Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling

353

8

0

30 May 2025

Control-R: Towards controllable test-time scaling

...

258

0

0

30 May 2025

Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

310

12

0

29 May 2025

Temporal Sampling for Forgotten Reasoning in LLMs

Bhaskar Ramasubramanian

Bill Yuchen Lin

Radha Poovendran

423

11

0

26 May 2025

Large Language Models for Planning: A Comprehensive and Systematic Survey

LLMAG LM&Ro OffRL ELM LRM

574

28

0

26 May 2025

Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration

707

25

0

23 May 2025

Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence

Amirhosein Ghasemabadi

319

8

0

23 May 2025

Learning to Choose or Choosing to Learn: Best-of-N vs. Supervised Fine-Tuning for Bit String Generation

Seamus Somerstep

323

0

0

22 May 2025

When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning

621

24

0

21 May 2025

Output Scaling: YingLong-Delayed Chain of Thought in a Large Pretrained Time Series Forecasting Model

AI4TS AI4CE LRM

304

11

0

20 May 2025

MR. Judge: Multimodal Reasoner as a Judge

432

5

0

19 May 2025
