2509.07430
Cited By
v1
v2 (latest)
The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward
9 September 2025
Long Li
Jiaran Hao
Jason Klein Liu
Zhijian Zhou
Yanting Miao
Wei Pang
Xiaoyu Tan
Wei Chu
Zhe Wang
Shirui Pan
Chao Qu
Yuan Qi
HuggingFace (3 upvotes)
Github (7★)
Papers citing "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward"
4 / 4 papers shown
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents
Guoqing Wang
Sunhao Dai
Guangze Ye
Zeyu Gan
Wei Yao
Yong Deng
Xiaofeng Wu
ZhenZhe Ying
OffRL
175
3
0
16 Oct 2025
Unlocking Exploration in RLVR: Uncertainty-aware Advantage Shaping for Deeper Reasoning
Can Xie
Ruotong Pan
Xiangyu Wu
Y. Zhang
Jiayi Fu
Tingting Gao
G. Zhou
OffRL
LRM
143
3
0
12 Oct 2025
Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
Wenhao Deng
Long Wei
Chenglei Yu
Tailin Wu
OffRL
ReLM
LRM
261
2
0
04 Oct 2025
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
Feijiang Han
Xiaodong Yu
Jianheng Tang
Delip Rao
Weihua Du
Lyle Ungar
355
6
0
16 May 2025