Anchored Preference Optimization and Contrastive Revisions: Addressing
Underspecification in Alignment

12 August 2024

Karel D'Oosterlinck

Thomas Demeester

Christopher Potts

Papers citing "Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment"

A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More. Zhichao Wang, Bin Bi, Shiva K. Pentyala, Kiran Ramnath, Sougata Chaudhuri, ..., Z. Zhu, Xiang-Bo Mao, S. Asur, Na Na Cheng. OffRL 31 38 0 23 Jul 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences. Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Hassan Awadallah, Tengyang Xie. 144 113 0 04 Apr 2024
Direct Preference Optimization with an Offset. Afra Amini, Tim Vieira, Ryan Cotterell. 71 54 0 16 Feb 2024
KTO: Model Alignment as Prospect Theoretic Optimization. Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela. 159 437 0 02 Feb 2024
Understanding Dataset Difficulty with $\mathcal{V}$-Usable Information. Kawin Ethayarajh, Yejin Choi, Swabha Swayamdipta. 154 157 0 16 Oct 2021