ResearchTrend.AI
Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?
Versions: v1, v2 (latest)

1 October 2025
Haizhong Zheng
Jiawei Zhao
Beidi Chen
    OffRL
ArXiv (abs) | PDF | HTML | HuggingFace (12 upvotes) | GitHub (21★)

Papers citing "Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?"

2 / 2 papers shown
How LLMs Learn to Reason: A Complex Network Perspective
Sihan Hu
X-D Cai
Yuan Huang
Zhiyuan Yao
Linfeng Zhang
Pan Zhang
Youjin Deng
Kun Chen
LRM
241
1
0
28 Sep 2025
Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training
Brian Bartoldson
S. Venkatraman
James Diffenderfer
Moksh Jain
Tal Ben-Nun
Seanie Lee
Minsu Kim
J. Obando-Ceron
Yoshua Bengio
B. Kailkhura
OffRL
334
12
0
24 Mar 2025