2503.11751
reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs
14 March 2025
Zhaofeng Wu, Michihiro Yasunaga, Andrew Cohen, Yoon Kim, Asli Celikyilmaz, Marjan Ghazvininejad
Papers citing
"reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs"
1 / 1 papers shown
Adversarial Training of Reward Models
Alexander Bukharin, Haifeng Qian, Shengyang Sun, Adithya Renduchintala, Soumye Singhal, Z. Wang, Oleksii Kuchaiev, Olivier Delalleau, T. Zhao
AAML
08 Apr 2025