2503.23913
Cited By
Entropy-Based Adaptive Weighting for Self-Training
31 March 2025
Xiaoxuan Wang
Yihe Deng
Mingyu Derek Ma
Wei Wang
LRM
ArXiv (abs)
PDF
HTML
HuggingFace (4 upvotes)
Github (19★)
Papers citing "Entropy-Based Adaptive Weighting for Self-Training"
5 / 5 papers shown
From Solving to Verifying: A Unified Objective for Robust Reasoning in LLMs
Xiaoxuan Wang
Bo Liu
Song Jiang
Jingzhou Liu
Jingyuan Qi
Xia Chen
Baosheng He
LRM
211
3
0
19 Nov 2025
On Memorization of Large Language Models in Logical Reasoning
Chulin Xie
Yangsibo Huang
Chiyuan Zhang
Da Yu
Xinyun Chen
Bill Yuchen Lin
Bo Li
Badih Ghazi
Ravi Kumar
LRM
539
104
0
30 Oct 2024
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Zikang Shan
Guhao Feng
Wei Xiong
Xinle Cheng
Li Zhao
Di He
Jiang Bian
Liwei Wang
781
114
0
29 Apr 2024
Insights into Alignment: Evaluating DPO and its Variants Across Multiple Tasks
Amir Saeidi
Shivanshu Verma
Chitta Baral
ALM
492
40
0
23 Apr 2024
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
International Conference on Learning Representations (ICLR), 2025
Haipeng Luo
Qingfeng Sun
Can Xu
Lu Wang
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
974
678
0
18 Aug 2023