2508.02618
Mitigating Attention Hacking in Preference-Based Reward Modeling via Interaction Distillation
4 August 2025
Jianxiang Zang
Meiling Ning
Shihan Dou
Jiazheng Zhang
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
Papers citing "Mitigating Attention Hacking in Preference-Based Reward Modeling via Interaction Distillation"
No papers found