2510.01161
Cited By
Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?
1 October 2025
Haizhong Zheng
Jiawei Zhao
Beidi Chen
OffRL
HuggingFace (12 upvotes)
Github (21★)
Papers citing "Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?"
2 / 2 papers shown
How LLMs Learn to Reason: A Complex Network Perspective
Sihan Hu
X-D Cai
Yuan Huang
Zhiyuan Yao
Linfeng Zhang
Pan Zhang
Youjin Deng
Kun Chen
LRM
241
1
0
28 Sep 2025
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
Brian Bartoldson
S. Venkatraman
James Diffenderfer
Moksh Jain
Tal Ben-Nun
Seanie Lee
Minsu Kim
J. Obando-Ceron
Yoshua Bengio
B. Kailkhura
OffRL
334
12
0
24 Mar 2025