2503.22230
Cited By
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
28 March 2025
Wei Shen
Guanlin Liu
Zheng Wu
Ruofei Zhu
Qingping Yang
Chao Xin
Yu Yue
Lin Yan
Papers citing "Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback"
6 / 6 papers shown
Title
Sailing AI by the Stars: A Survey of Learning from Rewards in Post-Training and Test-Time Scaling of Large Language Models
Xiaobao Wu
LRM
60
0
0
05 May 2025
Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
Junshu Pan
Wei Shen
Shulin Huang
Qiji Zhou
Yue Zhang
69
0
0
22 Apr 2025
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs
Yunhui Xia
Wei Shen
Yan Wang
Jason Klein Liu
Huifeng Sun
Siyue Wu
Jian Hu
Xiaolong Xu
AI4TS
21
1
0
20 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
58
4
0
09 Apr 2025
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Taiwei Shi
Yiyang Wu
Linxin Song
Tianyi Zhou
Jieyu Zhao
LRM
76
1
0
07 Apr 2025
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
Yu Yue
Yufeng Yuan
Qiying Yu
Xiaochen Zuo
Ruofei Zhu
...
Ru Zhang
Xin Liu
Mingxuan Wang
Yonghui Wu
Lin Yan
OffRL
LRM
19
5
0
07 Apr 2025