Information Theoretic Guarantees For Policy Alignment In Large Language
Models

Information Theoretic Guarantees For Policy Alignment In Large Language Models

9 June 2024

Youssef Mroueh

Papers citing "Information Theoretic Guarantees For Policy Alignment In Large Language Models"

5 / 5 papers shown

Title
Soft Best-of-n Sampling for Model Alignment C. M. Verdun Alex Oesterling Himabindu Lakkaraju Flavio du Pin Calmon BDL 44 0 0 06 May 2025
Variational Best-of-N Alignment Afra Amini Tim Vieira Ryan Cotterell Ryan Cotterell BDL 37 17 0 08 Jul 2024
KTO: Model Alignment as Prospect Theoretic Optimization Kawin Ethayarajh Winnie Xu Niklas Muennighoff Dan Jurafsky Douwe Kiela 153 437 0 02 Feb 2024
Theoretical guarantees on the best-of-n alignment policy Ahmad Beirami Alekh Agarwal Jonathan Berant Alex DÁmour Jacob Eisenstein Chirag Nagpal A. Suresh 42 42 0 03 Jan 2024
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022