Rethinking the Role of Proxy Rewards in Language Model Alignment

2 February 2024

Papers citing "Rethinking the Role of Proxy Rewards in Language Model Alignment"

5 / 5 papers shown

Textual Self-attention Network: Test-Time Preference Optimization through Textual Gradient-based Attention

280

10 Nov 2025

CancerGUIDE: Cancer Guideline Understanding via Internal Disagreement Estimation

314

09 Sep 2025

Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

283

29 May 2025

MPO: Multilingual Safety Alignment via Reward Gap Optimization

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

393

22 May 2025

Yi: Open Foundation Models by 01.AI

1.1K

828

07 Mar 2024