The Art of Scaling Reinforcement Learning Compute for LLMs

15 October 2025

Papers citing "The Art of Scaling Reinforcement Learning Compute for LLMs"

Advantage Shaping as Surrogate Reward Maximization: Unifying Pass@K Policy Gradients
Christos Thrampoulidis, Sadegh Mahdavi, Wenlong Deng
OffRL, 27 Oct 2025