arXiv:2504.07527
Supervised Optimism Correction: Be Confident When LLMs Are Sure
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
10 April 2025
Jing Zhang
Rushuai Yang
Shunyu Liu
Ting-En Lin
Fei Huang
Yi Chen
Yongqian Li
Dacheng Tao
OffRL
Papers citing "Supervised Optimism Correction: Be Confident When LLMs Are Sure"
3 / 3 papers shown
Title
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Kai Yang
Xin Xu
Yangkun Chen
Weijie Liu
Jiafei Lyu
Zichuan Lin
Deheng Ye
Saiyong Yang
221
1
0
19 Nov 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
888
561
0
03 Jan 2025
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Zikang Shan
Guhao Feng
Wei Xiong
Xinle Cheng
Li Zhao
Di He
Jiang Bian
Liwei Wang
585
96
0
29 Apr 2024