Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both

11 October 2024

Nikhil Krishnaswamy

Papers citing "Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both"

No papers