2509.07430
Cited By
The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward
9 September 2025
Long Li
Jiaran Hao
Jason Klein Liu
Zhijian Zhou
Yanting Miao
Wei Pang
Xiaoyu Tan
Wei Chu
Zhe Wang
Shirui Pan
Chao Qu
Yuan Qi
Papers citing "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward"
5 / 5 papers shown
Whatever Remains Must Be True: Filtering Drives Reasoning in LLMs, Shaping Diversity
Germán Kruszewski
Pierre Erbacher
Jos Rozen
Marc Dymetman
272
1
0
05 Dec 2025
Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn Search Agents
Guoqing Wang
Sunhao Dai
Guangze Ye
Zeyu Gan
Wei Yao
Yong Deng
Xiaofeng Wu
ZhenZhe Ying
LRM
234
9
0
16 Oct 2025
Unlocking Exploration in RLVR: Uncertainty-aware Advantage Shaping for Deeper Reasoning
Can Xie
Ruotong Pan
Xiangyu Wu
Y. Zhang
Jiayi Fu
Tingting Gao
G. Zhou
LRM
179
10
0
12 Oct 2025
Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
Wenhao Deng
Long Wei
Chenglei Yu
Tailin Wu
OffRL
ReLM
LRM
331
3
0
04 Oct 2025
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
Feijiang Han
Xiaodong Yu
Jianheng Tang
Delip Rao
Weihua Du
Lyle Ungar
522
10
0
16 May 2025