Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets
Versions: v1, v2, v3, v4 (latest)

15 August 2025
Benjamin Pikus
Pratyush Ranjan Tiwari
Burton Ye

Papers citing "Hard Examples Are All You Need: Maximizing GRPO Post-Training Under Annotation Budgets"

3 / 3 papers shown
BOTS: A Unified Framework for Bayesian Online Task Selection in LLM Reinforcement Finetuning
Qianli Shen
Daoyuan Chen
Yilun Huang
Zhenqing Ling
Yaliang Li
Bolin Ding
Jingren Zhou
OffRL
156
0
0
30 Oct 2025
ToMPO: Training LLM Strategic Decision Making from a Multi-Agent Perspective
Yiwen Zhang
Ziang Chen
Fanqi Kong
Yizhe Huang
Xue Feng
LLMAG
156
0
0
25 Sep 2025
MAPO: Mixed Advantage Policy Optimization
Wenke Huang
Quan Zhang
Yiyang Fang
Jian Liang
Xuankun Rong
...
Mingjun Li
Leszek Rutkowski
Mang Ye
Bo Du
Dacheng Tao
175
4
0
23 Sep 2025