ResearchTrend.AI
History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL

26 August 2025
Jingkai He
Tianjian Li
Erhu Feng
Dong Du
Qian Liu
Tao Liu
Yubin Xia
Haibo Chen
arXiv:2508.18588 | GitHub (15633★)

Papers citing "History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL"

9 / 9 papers shown
Fast LLM Post-training via Decoupled and Fastest-of-N Speculation
Rongxin Cheng
Kai Zhou
Xingda Wei
Siyuan Liu
Mingcong Han
...
Yeju Zhou
Baoquan Zhong
W. L. Xiao
Rong Chen
Haibo Chen
OffRL, LRM
436
0
0
24 Dec 2025
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
Yilong Zhao
Jiaming Tang
Kan Zhu
Zihao Ye
Chi-chih Chang
...
Mohamed S. Abdelfattah
Mingyu Gao
Baris Kasikci
Song Han
Ion Stoica
ReLM, LRM
186
1
0
01 Dec 2025
CoPRIS: Efficient and Stable Reinforcement Learning via Concurrency-Controlled Partial Rollout with Importance Sampling
Zekai Qu
Yinxu Pan
Ao Sun
Chaojun Xiao
Xu Han
81
0
0
05 Nov 2025
ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems
Qiaoling Chen
Zijun Liu
Peng Sun
Shenggui Li
Guoteng Wang
Ziming Liu
Yonggang Wen
Siyuan Feng
Tianwei Zhang
104
2
0
30 Oct 2025
LANPO: Bootstrapping Language and Numerical Feedback for Reinforcement Learning in LLMs
Ang Li
Yifei Wang
Zhihang Yuan
Stefanie Jegelka
Y. X. R. Wang
ALM, KELM
176
0
0
18 Oct 2025
Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony
H. Lu
Zichen Liu
Shaopan Xiong
Yancheng He
W. Gao
...
Wei Wang
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
OffRL
98
1
0
13 Oct 2025
Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
Ziyan Wang
Zheng Wang
Jie Fu
Xingwei Qu
Qi Cheng
Shengpu Tang
Minjia Zhang
Xiaoming Huo
LRM
244
1
0
05 Oct 2025
Prosperity before Collapse: How Far Can Off-Policy RL Reach with Stale Data on LLMs?
Haizhong Zheng
Jiawei Zhao
Bedi Chen
OffRL
154
5
0
01 Oct 2025
RollPacker: Mitigating Long-Tail Rollouts for Fast, Synchronous RL Post-Training
Wei Gao
Yuheng Zhao
Dakai An
Tianyuan Wu
Lunxi Cao
...
Yuchi Xu
Jiamang Wang
Lin Qu
B. Zheng
Wei Wang
OffRL, VLM
208
9
0
25 Sep 2025