2512.03847
Cited By
DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training
3 December 2025
Dingwei Zhu
Zhiheng Xi
Shihan Dou
Yuhui Wang
Sixian Li
Junjie Ye
Honglin Guo
Shichun Liu
Chenhao Huang
Yajie Yang
Junlin Shang
Senjie Jin
Ming Zhang
Jiazheng Zhang
Caishuang Huang
Yunke Zhang
Demei Yan
Yuran Wang
Tao Gui
OffRL
Papers citing "DVPO: Distributional Value Modeling-based Policy Optimization for LLM Post-Training"
No papers found