2502.18581
Cited By
v1
v2
v3 (latest)
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
25 February 2025
Zhewei Kang
Xuandong Zhao
Kurt Thomas
LRM
ArXiv (abs)
PDF
HTML
Github (28★)
Papers citing
"Scalable Best-of-N Selection for Large Language Models via Self-Certainty"
50 / 50 papers shown
Title
Distance Is All You Need: Radial Dispersion for Uncertainty Estimation in Large Language Models
Manh Trong Nguyen
Sunil Gupta
Hung Tuan Le
28
0
0
04 Dec 2025
MathSight: A Benchmark Exploring Have Vision-Language Models Really Seen in University-Level Mathematical Reasoning?
Yuandong Wang
Yao Cui
Yuxin Zhao
Zhen Yang
Yangfu Zhu
Zhenzhou Shao
CoGe
VLM
LRM
220
0
0
28 Nov 2025
Self-Evaluating LLMs for Multi-Step Tasks: Stepwise Confidence Estimation for Failure Detection
Vaibhav Mavi
Shubh Jaroria
Weiqi Sun
LRM
88
1
0
10 Nov 2025
Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads
Jingwei Ni
Ekaterina Fadeeva
Tianyi Wu
Mubashara Akhtar
Jiaheng Zhang
...
Markus Leippold
Timothy Baldwin
See-Kiong Ng
Artem Shelmanov
Mrinmaya Sachan
LRM
222
0
0
09 Nov 2025
Klear-AgentForge: Forging Agentic Intelligence through Posttraining Scaling
Qi Wang
Hongzhi Zhang
Jia-Yi Fu
Kai Fu
Yahui Liu
...
Yang Yue
J. Zhang
Fuzheng Zhang
Kun Gai
Guorui Zhou
86
1
0
08 Nov 2025
When, What, and How: Rethinking Retrieval-Enhanced Speculative Decoding
Min Fang
Zhihui Fu
Qibin Zhao
Jun Wang
84
0
0
03 Nov 2025
Inference-Time Chain-of-Thought Pruning with Latent Informativeness Signals
Sophie Li
Nicholas Huang
Nayan Saxena
Nina Luo
Vincent Lin
Kevin Zhu
Sunishchal Dev
BDL
LRM
379
0
0
01 Nov 2025
Do LLMs Signal When They're Right? Evidence from Neuron Agreement
Kang Chen
Yaoning Wang
Kai Xiong
Zhuoka Feng
Wenhe Sun
Haotian Chen
Yixin Cao
76
1
0
30 Oct 2025
Latent Chain-of-Thought for Visual Reasoning
Guohao Sun
Hang Hua
Jian Wang
Jiebo Luo
S. Dianat
Majid Rabbani
Raghuveer Rao
Zhiqiang Tao
BDL
OffRL
LRM
263
6
0
27 Oct 2025
Automated HIV Screening on Dutch Electronic Health Records with Large Language Models
Lang Zhou
Amrish Jhingoer
Yinghao Luo
Klaske Vliegenthart-Jongbloed
Carlijn Jordans
Ben Werkhoven
T. Seinen
E. V. Mulligen
Casper Rokx
Yunlei Li
109
0
0
22 Oct 2025
See, Think, Act: Online Shopper Behavior Simulation with VLM Agents
Yimeng Zhang
Jiri Gesi
Ran Xue
Tian Wang
Ziyi Wang
...
Qingjun Cui
Yufan Guo
Jing Huang
Mubarak Shah
Dakuo Wang
OffRL
164
0
0
22 Oct 2025
Certified Self-Consistency: Statistical Guarantees and Test-Time Training for Reliable Reasoning in LLMs
Paula Cordero-Encinar
Andrew Duncan
LRM
193
1
0
20 Oct 2025
Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
Sunny Yu
Ahmad Jabbar
Robert Hawkins
Dan Jurafsky
Myra Cheng
180
1
0
14 Oct 2025
EAGER: Entropy-Aware GEneRation for Adaptive Inference-Time Scaling
Daniel Scalena
Leonidas Zotos
Elisabetta Fersini
Malvina Nissim
Ahmet Üstün
LRM
80
1
0
13 Oct 2025
A Survey on Agentic Multimodal Large Language Models
Huanjin Yao
Ruifei Zhang
Jiaxing Huang
Jingyi Zhang
Yibo Wang
...
Ruolin Zhu
Yongcheng Jing
Shunyu Liu
Guanbin Li
Dacheng Tao
LM&Ro
AIFin
AI4TS
LRM
AI4CE
245
4
0
13 Oct 2025
Enhancing LLM Reasoning via Non-Human-Like Reasoning Path Preference Optimization
Junjie Lu
Yuliang Liu
Chaofeng Qu
Wei Shen
Zhouhan Lin
Min Xu
LRM
148
0
0
13 Oct 2025
Unlocking Exploration in RLVR: Uncertainty-aware Advantage Shaping for Deeper Reasoning
Can Xie
Ruotong Pan
Xiangyu Wu
Y. Zhang
Jiayi Fu
Tingting Gao
G. Zhou
OffRL
LRM
124
1
0
12 Oct 2025
Harnessing Consistency for Robust Test-Time LLM Ensemble
Zhichen Zeng
Qi Yu
Xiao Lin
Ruizhong Qiu
Xuying Ning
Tianxin Wei
Yuchen Yan
Jingrui He
Hanghang Tong
124
1
0
12 Oct 2025
Tracing the Traces: Latent Temporal Signals for Efficient and Accurate Reasoning
Martina G. Vilas
Safoora Yousefi
Besmira Nushi
Eric Horvitz
Vidhisha Balachandran
LRM
108
0
0
12 Oct 2025
Revisiting the UID Hypothesis in LLM Reasoning Traces
Minju Gwak
Guijin Son
Jaehyung Kim
LRM
92
0
0
11 Oct 2025
MatryoshkaThinking: Recursive Test-Time Scaling Enables Efficient Reasoning
Hongwei Chen
Yishu Lei
Dan Zhang
Bo Ke
Danxiang Zhu
...
Shikun Feng
Jingzhou He
Yu Sun
Hua Wu
Haifeng Wang
ReLM
LRM
132
0
0
11 Oct 2025
When Retrieval Succeeds and Fails: Rethinking Retrieval-Augmented Generation for LLMs
Yongjie Wang
Yue Yu
Kaisong Song
Jun Lin
Zhiqi Shen
RALM
3DV
198
0
0
10 Oct 2025
FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs
Yan Wang
Penglei Gao
Shengyuan Lin
Jaisal Patel
Jeff Zhao
...
Lingfei Qian
J. Huang
Efstathia Soufleri
Xiao-Yang Liu
J. Nie
108
1
0
10 Oct 2025
Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces
Minju Gwak
Guijin Son
Jaehyung Kim
100
0
0
08 Oct 2025
VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization
Xinye Cao
Hongcan Guo
Jiawen Qian
Guoshun Nan
Chao Wang
Yuqi Pan
Tianhao Hou
X. Wang
Yutong Gao
VGen
136
0
0
07 Oct 2025
Sample Smart, Not Hard: Correctness-First Decoding for Better Reasoning in LLMs
Xueyan Li
Guinan Su
Mrinmaya Sachan
Jonas Geiping
LRM
97
0
0
07 Oct 2025
Verifier-free Test-Time Sampling for Vision Language Action Models
Suhyeok Jang
Dongyoung Kim
Changyeon Kim
Youngsuk Kim
Jinwoo Shin
108
0
0
07 Oct 2025
Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
Wengao Ye
Yan Liang
Lianlei Shan
OffRL
LRM
382
2
0
05 Oct 2025
Best of mini-N in-loop Sampling: A Contextual Quality Reward Model for Reliable and Efficient Best-of-N Sampling
Hyung Gyu Rho
Sian Lee
149
0
0
05 Oct 2025
Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
Xu Wang
Yan Hu
Benyou Wang
Difan Zou
LLMSV
208
1
0
04 Oct 2025
On the Role of Temperature Sampling in Test-Time Scaling
Yuheng Wu
Azalia Mirhoseini
Thierry Tambe
ALM
LRM
93
1
1
02 Oct 2025
Go with Your Gut: Scaling Confidence for Autoregressive Image Generation
Harold Haodong Chen
Xianfeng Wu
Wen-Jie Shu
Rongjin Guo
Disen Lan
Harry Yang
Ying-Cong Chen
128
1
0
30 Sep 2025
Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
Aakriti Agrawal
R. Aralikatti
Anirudh Satheesh
Souradip Chakraborty
Amrit Singh Bedi
Furong Huang
LRM
116
2
0
30 Sep 2025
IRIS: Intrinsic Reward Image Synthesis
Yihang Chen
Yuanhao Ban
Yunqi Hong
Cho-Jui Hsieh
69
1
0
29 Sep 2025
Reinforcement Learning Meets Large Language Models: A Survey of Advancements and Applications Across the LLM Lifecycle
Keliang Liu
Dingkang Yang
Ziyun Qian
Weijie Yin
Y. Wang
Hongsheng Li
Jun Liu
Peng Zhai
Y. Liu
Lihua Zhang
OffRL
LRM
226
6
0
20 Sep 2025
Connections between reinforcement learning with feedback, test-time scaling, and diffusion guidance: An anthology
Yuchen Jiao
Yuxin Chen
Gen Li
OffRL
124
0
0
04 Sep 2025
Know When to Explore: Difficulty-Aware Certainty as a Guide for LLM Reinforcement Learning
Ang Li
Zhihang Yuan
Yang Zhang
Shouda Liu
Yisen Wang
124
4
0
29 Aug 2025
PiCSAR: Probabilistic Confidence Selection And Ranking
Joshua Ong Jun Leang
Zheng Zhao
Aryo Pradipta Gema
Sohee Yang
Wai-Chung Kwan
Xuanli He
Wenda Li
Pasquale Minervini
Eleonora Giunchiglia
Shay B. Cohen
ReLM
BDL
LRM
205
3
0
29 Aug 2025
Deep Think with Confidence
Yichao Fu
Xuewei Wang
Yuandong Tian
Jiawei Zhao
ReLM
BDL
LRM
229
57
0
21 Aug 2025
Maximizing Prefix-Confidence at Test-Time Efficiently Improves Mathematical Reasoning
Matthias Otth
Jonas Hübotter
Ido Hakimi
Andreas Krause
ReLM
LRM
212
2
0
24 Jul 2025
Confident RAG: Enhancing the Performance of LLMs for Mathematics Question Answering through Multi-Embedding and Confidence Scoring
S. Chen
Zijian Zhao
Jinsong Chen
203
2
0
23 Jul 2025
Shop-R1: Rewarding LLMs to Simulate Human Behavior in Online Shopping via Reinforcement Learning
Yimeng Zhang
Tian Wang
Jiri Gesi
Liang Luo
Yuxuan Lu
...
Ran Xue
Houyu Zhang
Qingjun Cui
Yufan Guo
Dakuo Wang
OffRL
RALM
LRM
269
5
0
23 Jul 2025
DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling
Fei Wang
Xingchen Wan
Ruoxi Sun
Jiefeng Chen
Sercan Ö. Arık
LRM
178
1
0
19 Jun 2025
Reinforcing Video Reasoning with Focused Thinking
Jisheng Dang
Jingze Wu
T. Wang
Xuanhui Lin
Nannan Zhu
Hongbo Chen
Wei-Shi Zheng
Meng Wang
Tat-Seng Chua
OffRL
LRM
339
12
0
30 May 2025
Learning to Reason without External Rewards
Xuandong Zhao
Zhewei Kang
Aosong Feng
Sergey Levine
Dawn Song
OffRL
ReLM
LRM
403
92
0
26 May 2025
Batched Self-Consistency Improves LLM Relevance Assessment and Ranking
Anton Korikov
Pan Du
Scott Sanner
Navid Rekabsaz
186
1
0
18 May 2025
TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning
Tunyu Zhang
Haizhou Shi
Yibin Wang
Hengyi Wang
Xiaoxiao He
...
Ligong Han
Kai Xu
Huatian Zhang
Dimitris N. Metaxas
Hao Wang
LRM
460
5
0
16 May 2025
Reasoning Models Can Be Effective Without Thinking
Wenjie Ma
Jingxuan He
Charlie Snell
Tyler Griggs
Sewon Min
Matei A. Zaharia
ReLM
LRM
398
109
1
14 Apr 2025
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
Robert Z. Sparks
Charlie Snell
Kanishk Gandhi
Alon Albalak
Anikait Singh
...
Dakota Mahan
Louis Castricato
Jan-Philipp Fränken
Nick Haber
Chelsea Finn
LRM
366
80
0
08 Jan 2025
Cool-Fusion: Fuse Large Language Models without Training
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Cong Liu
Xiaojun Quan
Yan Pan
Liangzhi Li
Weigang Wu
Xu Chen
MoMe
VLM
378
9
0
29 Jul 2024