arXiv: 2510.01161

Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?

Versions: v1, v2 (latest)

1 October 2025

ArXiv (abs) | PDF | HTML | HuggingFace (12 upvotes) | GitHub (21★)

Papers citing "Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?"

2 / 2 papers shown

How LLMs Learn to Reason: A Complex Network Perspective

28 Sep 2025

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Brian Bartoldson, James Diffenderfer, J. Obando-Ceron

24 Mar 2025
