2508.14094
Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets
15 August 2025
Benjamin Pikus, Pratyush Ranjan Tiwari, Burton Ye
Papers citing "Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets"
3 / 3 papers shown
BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning
Qianli Shen, Daoyuan Chen, Yilun Huang, Zhenqing Ling, Yaliang Li, Bolin Ding, Jingren Zhou
OffRL
156
0
0
30 Oct 2025
ToMPO: Training LLM Strategic Decision Making from a Multi-Agent Perspective
Yiwen Zhang, Ziang Chen, Fanqi Kong, Yizhe Huang, Xue Feng
LLMAG
156
0
0
25 Sep 2025
MAPO: Mixed Advantage Policy Optimization
Wenke Huang, Quan Zhang, Yiyang Fang, Jian Liang, Xuankun Rong, ..., Mingjun Li, Leszek Rutkowski, Mang Ye, Bo Du, Dacheng Tao
175
4
0
23 Sep 2025