2408.14874
Cited By
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
27 August 2024
Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang
Papers citing
"Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data"
2 / 2 papers shown
Title
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos, Tomás Almeida, Daniel Vareta, Filipe Azevedo, Sweta Agrawal, Patrick Fernandes, André F. T. Martins
31
1
0
08 Nov 2024
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts
Haoxiang Wang, Wei Xiong, Tengyang Xie, Han Zhao, Tong Zhang
44
13
0
18 Jun 2024