Supervised Optimism Correction: Be Confident When LLMs Are Sure

Annual Meeting of the Association for Computational Linguistics (ACL), 2025
10 April 2025
Jing Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongqian Li, Dacheng Tao

Papers citing "Supervised Optimism Correction: Be Confident When LLMs Are Sure"

3 citing papers:
EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control
Kai Yang, Xin Xu, Yangkun Chen, Weijie Liu, Jiafei Lyu, Zichuan Lin, Deheng Ye, Saiyong Yang
19 Nov 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini
03 Jan 2025
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong, Zikang Shan, Guhao Feng, Wei Xiong, Xinle Cheng, Li Zhao, Di He, Jiang Bian, Liwei Wang
29 Apr 2024