2305.00633
Cited By
Self-Evaluation Guided Beam Search for Reasoning
Neural Information Processing Systems (NeurIPS), 2023
1 May 2023
Yuxi Xie
Kenji Kawaguchi
Yiran Zhao
Xu Zhao
Min-Yen Kan
Junxian He
Qizhe Xie
LRM
Papers citing
"Self-Evaluation Guided Beam Search for Reasoning"
50 / 148 papers shown
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
Jianghao Lin
Yuanyuan Shi
Xin Peng
Renjie Ding
Hairui Wang
...
Fengshuo Bai
Huacan Chai
Weinan Zhang
Fei Huang
Y. Wen
153
1
0
30 Apr 2026
Efficiency Will Not Lead to Sustainable Reasoning AI
Philipp Wiesner
Daniel W. O'Neill
Francesca Larosa
O. Kao
LRM
251
2
0
19 Nov 2025
Confidence-Guided Stepwise Model Routing for Cost-Efficient Reasoning
Sangmook Lee
Dohyung Kim
Hyukhun Koh
Nakyeong Yang
Kyomin Jung
LRM
199
2
0
09 Nov 2025
Test-time Scaling of LLMs: A Survey from A Subproblem Structure Perspective
Zhuoyi Yang
Xu Guo
Tong Zhang
Huijuan Xu
Boyang Albert Li
LRM
209
2
0
01 Nov 2025
RETuning: Upgrading Inference-Time Scaling for Stock Movement Prediction with Large Language Models
Xueyuan Lin
Cehao Yang
Ye Ma
Ming Li
Rongjunchen Zhang
Yang Ni
Xiaojun Wu
Chengjin Xu
Jian Guo
Hui Xiong
AIFin
LRM
218
0
0
24 Oct 2025
Limits of PRM-Guided Tree Search for Mathematical Reasoning with LLMs
Tristan Cinquin
Geoff Pleiss
Agustinus Kristiadi
AIMat
LRM
308
0
0
23 Oct 2025
Med-VRAgent: A Framework for Medical Visual Reasoning-Enhanced Agents
Guangfu Guo
Xiaoqian Lu
Yue Feng
LRM
226
1
0
21 Oct 2025
Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs
Paula Cordero-Encinar
Andrew Duncan
LRM
247
4
0
20 Oct 2025
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
Hongwei Chen
Yishu Lei
Dan Zhang
Bo Ke
Danxiang Zhu
...
Shikun Feng
Jingzhou He
Yu Sun
Hua Wu
Haifeng Wang
ReLM
LRM
175
1
0
11 Oct 2025
Logit Arithmetic Elicits Long Reasoning Capabilities Without Training
Y. Zhang
Muhammad Khalifa
Lechen Zhang
Xin Liu
Ayoung Lee
Xinliang Frederick Zhang
Farima Fatahi Bayat
L. Wang
RALM
LRM
150
6
0
10 Oct 2025
Increasing LLM response trustworthiness using voting ensembles
Aparna Nair-Kanneganti
Trevor J. Chan
Shir Goldfinger
Emily Mackay
Brian Anthony
Alison M. Pouch
173
1
0
05 Oct 2025
PatternKV: Flattening KV Representation Expands Quantization Headroom
Ji Zhang
Yiwei Li
Shaoxiong Feng
Peiwen Yuan
Xinglin Wang
...
Y. Zhang
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
MQ
223
0
0
05 Oct 2025
MITS: Enhanced Tree Search Reasoning for LLMs via Pointwise Mutual Information
Jiaxi Li
Yucheng Shi
Jin Lu
Ninghao Liu
LRM
183
0
0
04 Oct 2025
Efficient Test-Time Scaling for Small Vision-Language Models
Mehmet Onurcan Kaya
Desmond Elliott
Dim P. Papadopoulos
VLM
272
3
0
03 Oct 2025
Sample, Align, Synthesize: Graph-Based Response Synthesis with ConGrs
Sayan Ghosh
Shahzaib Saqib Warraich
Dhruv Tarsadiya
Gregory Yauney
Swabha Swayamdipta
230
0
0
03 Oct 2025
From Perception to Cognition: A Survey of Vision-Language Interactive Reasoning in Multimodal Large Language Models
Chenyue Zhou
Mingxuan Wang
Yanbiao Ma
Chenxu Wu
Wanyi Chen
...
Guoli Jia
Lingling Li
Z. Lu
Y. Lu
Wenhan Luo
LRM
638
15
0
29 Sep 2025
Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts
Ammar Ahmed
A. Khan
Ayaan Ahmad
Sheng Di
Zirui Liu
Ali Anwar
ReLM
LRM
230
2
0
26 Sep 2025
Dynamic Experts Search: Enhancing Reasoning in Mixture-of-Experts LLMs at Test Time
Yixuan Han
Fan Ma
Ruijie Quan
Yi Yang
MoE
LRM
140
0
0
26 Sep 2025
Think Right, Not More: Test-Time Scaling for Numerical Claim Verification
Primakov Chungkham
Venktesh V
Vinay Setty
Avishek Anand
LRM
150
1
0
26 Sep 2025
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
Ziang Yan
Xinhao Li
Yinan He
Zhengrong Yue
Xiangyu Zeng
Yali Wang
Yu Qiao
Limin Wang
Yi Wang
MLLM
VLM
LRM
256
26
0
25 Sep 2025
Distribution-Aligned Decoding for Efficient LLM Task Adaptation
Senkang Hu
Xudong Han
Jinqi Jiang
Yihang Tao
Zihan Fang
Yong Dai
Sam Kwong
Yuguang Fang
322
6
0
19 Sep 2025
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
Zhiheng Xi
J. Huang
Chenyang Liao
Baodai Huang
Honglin Guo
...
Tao Gui
Zuxuan Wu
Qi Zhang
Xuanjing Huang
Yu-Gang Jiang
191
41
0
10 Sep 2025
From Long to Short: LLMs Excel at Trimming Own Reasoning Chains
Wei Han
Geng Zhan
Sicheng Yu
Chenyu Wang
Bryan Hooi
LRM
212
1
0
07 Sep 2025
CoVeR: Conformal Calibration for Versatile and Reliable Autoregressive Next-Token Prediction
Yuzhu Chen
Yingjie Wang
Shunyu Liu
Yongcheng Jing
Dacheng Tao
268
0
0
05 Sep 2025
Towards Reasoning for PDE Foundation Models: A Reward-Model-Driven Inference-Time-Scaling Algorithm
Siddharth Mansingh
James Amarel
Ragib Arnab
Arvind Mohan
Kamaljeet Singh
...
Emily Casleton
Nathan DeBardeleben
Ayan Biswas
Diane Oyen
Earl Lawrence
AI4TS
AI4CE
LRM
273
1
0
02 Sep 2025
LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation
Yang Sun
Lixin Zou
Dan Luo
Zhiyong Xie
Liming Dong
Yunwei Zhao
Y. Lu
Chenliang Li
OffRL
236
0
0
27 Aug 2025
Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS
Can Jin
Yang Zhou
Qixin Zhang
Hongwu Peng
Di Zhang
...
Ligong Han
Zhang-Wei Hong
Tong Che
Dimitris N. Metaxas
OffRL
LRM
368
10
0
19 Aug 2025
FedCoT: Communication-Efficient Federated Reasoning Enhancement for Large Language Models
Chuan Li
Qianyi Zhao
Fengran Mo
Cen Chen
LRM
196
1
0
07 Aug 2025
AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks
Fali Wang
Hui Liu
Zhenwei Dai
Jingying Zeng
Zhiwei Zhang
...
Chen Luo
Zhen Li
Xianfeng Tang
Qi He
Suhang Wang
LLMAG
347
11
0
26 Jul 2025
MindJourney: Test-Time Scaling with World Models for Spatial Reasoning
Yuncong Yang
Jiageng Liu
Zheyuan Zhang
Siyuan Zhou
Reuben Tan
Jianwei Yang
Yilun Du
Chuang Gan
VGen
LRM
370
2
0
16 Jul 2025
Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning
Zijun Chen
Wenbo Hu
Richang Hong
LRM
227
1
0
14 Jul 2025
Making VLMs More Robot-Friendly: Self-Critical Distillation of Low-Level Procedural Reasoning
Chan Young Park
Jillian R. Fisher
Marius Memmel
Dipika Khullar
Seoho Yun
Abhishek Gupta
Yejin Choi
LRM
406
3
0
11 Jul 2025
Think How to Think: Mitigating Overthinking with Autonomous Difficulty Cognition in Large Reasoning Models
Y. Liu
Haoxi Li
Xiaosong Ma
Jie Zhang
Song Guo
LRM
350
4
0
03 Jul 2025
VIDEE: Visual and Interactive Decomposition, Execution, and Evaluation of Text Analytics with Intelligent Agents
Sam Yu-Te Lee
Chenyang Ji
Shicheng Wen
Lifu Huang
Dongyu Liu
Kwan-Liu Ma
386
0
0
17 Jun 2025
Learning to Reason Across Parallel Samples for LLM Reasoning
Jianing Qi
Xi Ye
Hao Tang
Zhigang Zhu
Eunsol Choi
ReLM
LRM
353
19
0
10 Jun 2025
HAIBU-ReMUD: Reasoning Multimodal Ultrasound Dataset and Model Bridging to General Specific Domains
Shijie Wang
Yilun Zhang
Zeyu Lai
Dexing Kong
265
0
0
09 Jun 2025
From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium
Xie Yi
Zhanke Zhou
Chentao Cao
Qiyu Niu
Tongliang Liu
Bo Han
326
16
0
09 Jun 2025
LLM-First Search: Self-Guided Exploration of the Solution Space
Nathan Herr
Tim Rocktäschel
Roberta Raileanu
LRM
408
3
0
05 Jun 2025
Incentivizing LLMs to Self-Verify Their Answers
Fuxiang Zhang
Jiacheng Xu
Chaojie Wang
Ce Cui
Yang Liu
Rui Hu
ReLM
LRM
527
6
0
02 Jun 2025
Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling
Xinglin Wang
Yiwei Li
Shaoxiong Feng
Peiwen Yuan
Y. Zhang
Jiayi Shi
Chuyi Tan
Boyuan Pan
Yao Hu
Kan Li
LRM
353
8
0
30 May 2025
Control-R: Towards controllable test-time scaling
Di Zhang
Weida Wang
Junxian Li
Xunzhi Wang
Jiatong Li
...
Peng Ye
Shufei Zhang
Xuming He
Yuqiang Li
Dongzhan Zhou
LRM
258
0
0
30 May 2025
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness
Yongjin Yang
Euiin Yi
Jongwoo Ko
Kimin Lee
Zhijing Jin
Se-Young Yun
LLMAG
310
12
0
29 May 2025
Temporal Sampling for Forgotten Reasoning in LLMs
Yuetai Li
Zhangchen Xu
Fengqing Jiang
Bhaskar Ramasubramanian
Luyao Niu
Bill Yuchen Lin
Xiang Yue
Radha Poovendran
CLL
KELM
LRM
423
11
0
26 May 2025
Large Language Models for Planning: A Comprehensive and Systematic Survey
Pengfei Cao
Tianyi Men
Wencan Liu
Jingwen Zhang
Xuzhao Li
Xixun Lin
Dianbo Sui
Yanan Cao
Kang Liu
Jun Zhao
LLMAG
LM&Ro
OffRL
ELM
LRM
574
28
0
26 May 2025
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration
Jingtong Gao
Ling Pan
Yejing Wang
Rui Zhong
Chi Lu
Qingpeng Cai
Peng Jiang
Xiangyu Zhao
LRM
707
25
0
23 May 2025
Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence
Amirhosein Ghasemabadi
Keith G. Mills
Baochun Li
Di Niu
LRM
319
8
0
23 May 2025
Learning to Choose or Choosing to Learn: Best-of-N vs. Supervised Fine-Tuning for Bit String Generation
Seamus Somerstep
Vinod Raman
Unique Subedi
Yuekai Sun
323
0
0
22 May 2025
When to Continue Thinking: Adaptive Thinking Mode Switching for Efficient Reasoning
Xiaoyun Zhang
Jingqing Ruan
Xing Ma
Yawen Zhu
Haodong Zhao
Hao Li
Jiansong Chen
Ke Zeng
Xunliang Cai
LRM
621
24
0
21 May 2025
Output Scaling: YingLong-Delayed Chain of Thought in a Large Pretrained Time Series Forecasting Model
Qingsong Wen
Tian Zhou
Jinyang Gao
Bolin Ding
Jingren Zhou
AI4TS
AI4CE
LRM
304
11
0
20 May 2025
MR. Judge: Multimodal Reasoner as a Judge
Renjie Pi
Felix Bai
Qibin Chen
Simon Wang
Jiulong Shan
Kieran Liu
Meng Cao
ELM
LRM
432
5
0
19 May 2025