2509.22047
MO-GRPO: Mitigating Reward Hacking of Group Relative Policy Optimization on Multi-Objective Problems
26 September 2025
Yuki Ichihara
Yuu Jinnai
Tetsuro Morimura
Mitsuki Sakamoto
Ryota Mitsuhashi
Eiji Uchibe
Papers citing this work: none found.