Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning

18 April 2025

Papers citing "Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning"

Title: Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu · LRM · 62 · 0 · 0 · 05 May 2025