From Distributional to Overton Pluralism: Investigating Large Language Model Alignment

25 June 2024

Papers citing "From Distributional to Overton Pluralism: Investigating Large Language Model Alignment"

Title
What do Language Model Probabilities Represent? From Distribution Estimation to Response Prediction Eitan Wagner Omri Abend 19 0 0 04 May 2025
Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions Joseph Suh Erfan Jahanparast Suhong Moon Minwoo Kang Serina Chang ALM LM&MA 39 1 0 24 Feb 2025
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements Jingyu Zhang Ahmed Elgohary Ahmed Magooda Daniel Khashabi Benjamin Van Durme 33 2 0 11 Oct 2024
Diversity-Rewarded CFG Distillation Geoffrey Cideron A. Agostinelli Johan Ferret Sertan Girgin Romuald Elie Olivier Bachem Sarah Perrin Alexandre Ramé 29 2 0 08 Oct 2024
CS4: Measuring the Creativity of Large Language Models Automatically by Controlling the Number of Story-Writing Constraints Anirudh Atmakuru Jatin Nainani Rohith Siddhartha Reddy Bheemreddy Anirudh Lakkaraju Zonghai Yao Hamed Zamani Haw-Shiuan Chang 34 2 0 05 Oct 2024
Can Language Models Reason about Individualistic Human Values and Preferences? Liwei Jiang Taylor Sorensen Sydney Levine Yejin Choi 23 7 0 04 Oct 2024
DiverseDialogue: A Methodology for Designing Chatbots with Human-Like Diversity Xiaoyu Lin Xinkai Yu Ankit Aich Salvatore Giorgi Lyle Ungar ALM 24 0 0 30 Aug 2024
Predicting vs. Acting: A Trade-off Between World Modeling & Agent Modeling Margaret Li Weijia Shi Artidoro Pagnoni Peter West Ari Holtzman 27 4 0 02 Jul 2024
Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores Chantal Shaib Joe Barrow Jiuding Sun Alexa F. Siu Byron C. Wallace A. Nenkova 56 31 0 01 Mar 2024
What Evidence Do Language Models Find Convincing? Alexander Wan Eric Wallace Dan Klein 196 28 0 19 Feb 2024
Exploring Precision and Recall to assess the quality and diversity of LLMs Florian Le Bronnec Alexandre Verine Benjamin Négrevergne Y. Chevaleyre Alexandre Allauzen 29 3 0 16 Feb 2024
A Roadmap to Pluralistic Alignment Taylor Sorensen Jared Moore Jillian R. Fisher Mitchell L. Gordon Niloofar Mireshghallah ... Liwei Jiang Ximing Lu Nouha Dziri Tim Althoff Yejin Choi 59 75 0 07 Feb 2024
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning Hao Zhao Maksym Andriushchenko Francesco Croce Nicolas Flammarion ALM 89 41 0 07 Feb 2024
Understanding the Effects of RLHF on LLM Generalisation and Diversity Robert Kirk Ishita Mediratta Christoforos Nalmpantis Jelena Luketina Eric Hambro Edward Grefenstette Roberta Raileanu AI4CE ALM 95 63 0 10 Oct 2023
Training language models to follow instructions with human feedback Long Ouyang Jeff Wu Xu Jiang Diogo Almeida Carroll L. Wainwright ... Amanda Askell Peter Welinder Paul Christiano Jan Leike Ryan J. Lowe OSLM ALM 301 11,730 0 04 Mar 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization Victor Sanh Albert Webson Colin Raffel Stephen H. Bach Lintang Sutawika ... T. Bers Stella Biderman Leo Gao Thomas Wolf Alexander M. Rush LRM 203 1,651 0 15 Oct 2021