A Roadmap to Pluralistic Alignment

7 February 2024

Niloofar Mireshghallah

Yejin Choi

Papers citing "A Roadmap to Pluralistic Alignment"

TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories

108

26 Nov 2025

Personalized Reward Modeling for Text-to-Image Generation

156

21 Nov 2025

Pluralistic Behavior Suite: Stress-Testing Multi-Turn Adherence to Custom Behavioral Policies

Prasoon Varshney

Makesh Narsimhan Sreedhar

Liwei Jiang

Traian Rebedea

Christopher Parisien

114

07 Nov 2025

Human-AI Collaboration with Misaligned Preferences

Jiaxin Song

Parnian Shahkar

Kate Donahue

Bhaskar Ray Chaudhury

HAI

169

04 Nov 2025

Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback

Chu Fei Luo

Samuel Dahan

Xiaodan Zhu

102

17 Oct 2025

MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages

154

30 Sep 2025

Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing

131

26 Sep 2025

The Alignment Bottleneck

Wenjun Cao

217

19 Sep 2025

Decoding Alignment: A Critical Survey of LLM Development Initiatives through Value-setting and Data-centric Lens

Ilias Chalkidis

OffRL ALM

154

23 Aug 2025

CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions

232

03 Aug 2025

The Homogenizing Effect of Large Language Models on Human Expression and Thought

Zhivar Sourati

Alireza S. Ziabari

Morteza Dehghani

171

02 Aug 2025

Learning to summarize user information for personalized reinforcement learning from human feedback

229

17 Jul 2025

Reward Model Interpretability via Optimal and Pessimal Tokens

Conference on Fairness, Accountability and Transparency (FAccT), 2025

Brian Christian

Hannah Rose Kirk

Jessica A.F. Thompson

Christopher Summerfield

Tsvetomira Dumbalska

AAML

237

08 Jun 2025

QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

231

04 Jun 2025

Aligning VLM Assistants with Personalized Situated Cognition

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

...

242

01 Jun 2025

Meaning Is Not A Metric: Using LLMs to make cultural context legible at scale

214

23 May 2025

AI-Augmented LLMs Achieve Therapist-Level Responses in Motivational Interviewing

485

23 May 2025

Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences?

394

19 May 2025

Pairwise Calibrated Rewards for Pluralistic Alignment

220

17 May 2025

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

293

20 Apr 2025

Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

376

17 Apr 2025

DICE: A Framework for Dimensional and Contextual Evaluation of Language Models

Aryan Shrivastava

Paula Akemi Aoyagui

303

14 Apr 2025

Societal Impacts Research Requires Benchmarks for Creative Composition Tasks

Judy Hanwen Shen

Carlos Guestrin

612

09 Apr 2025

Strategyproof Reinforcement Learning from Human Feedback

Thomas Kleine Buening

Jiarui Gan

Debmalya Mandal

Marta Z. Kwiatkowska

280

12 Mar 2025

CoPL: Collaborative Preference Learning for Personalizing LLMs

379

03 Mar 2025

On Benchmarking Human-Like Intelligence in Machines

913

27 Feb 2025

Is Free Self-Alignment Possible?

426

24 Feb 2025

The Call for Socially Aware Language Technologies

397

24 Feb 2025

C3AI: Crafting and Evaluating Constitutions for Constitutional AI

The Web Conference (WWW), 2025

228

21 Feb 2025

AI Alignment at Your Discretion

Conference on Fairness, Accountability and Transparency (FAccT), 2025

320

10 Feb 2025

Clone-Robust AI Alignment

Ariel D. Procaccia

Benjamin G. Schiffer

Shirley Zhang

210

17 Jan 2025

Evaluating the Prompt Steerability of Large Language Models

Erik Miehling

Michael Desmond

Karthikeyan N. Ramamurthy

435

19 Nov 2024

Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models

262

17 Oct 2024

Large Language Models, and LLM-Based Agents, Should Be Used to Enhance the Digital Public Sphere

264

15 Oct 2024

Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only

International Conference on Learning Representations (ICLR), 2024

236

14 Oct 2024

Intuitions of Compromise: Utilitarianism vs. Contractualism

Jared Moore

Yejin Choi

Sydney Levine

230

07 Oct 2024

Moral Alignment for LLM Agents

International Conference on Learning Representations (ICLR), 2024

Elizaveta Tennant

Stephen Hailes

Mirco Musolesi

507

02 Oct 2024

Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors

ACM Symposium on User Interface Software and Technology (UIST), 2024

263

26 Sep 2024

Open-World Evaluation for Retrieving Diverse Perspectives

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Hung-Ting Chen

Eunsol Choi

354

26 Sep 2024

Policy Prototyping for LLMs: Pluralistic Alignment via Interactive and Collaborative Policymaking

K. J. Kevin Feng

Inyoung Cheong

Quan Ze Chen

Amy X. Zhang

338

13 Sep 2024

Programming Refusal with Conditional Activation Steering

International Conference on Learning Representations (ICLR), 2024

Bruce W. Lee

Inkit Padhi

Karthikeyan N. Ramamurthy

502

06 Sep 2024

User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions

International Conference on Human Factors in Computing Systems (CHI), 2024

Xuhui Zhou

314

01 Sep 2024

Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Tingchen Fu

Yupeng Hou

Julian McAuley

Rui Yan

309

09 Aug 2024

Generative Monoculture in Large Language Models

Fan Wu

Emily Black

Varun Chandrasekaran

SyDa

204

02 Jul 2024

From Distributional to Overton Pluralism: Investigating Large Language Model Alignment

Thom Lake

Eunsol Choi

Greg Durrett

420

25 Jun 2024

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

IEEE Internet Computing (IEEE Internet Comput.), 2024

...

Rosario A. Uceda-Sosa

Kush R. Varshney

226

08 Mar 2024

Black-Box Access is Insufficient for Rigorous AI Audits

Conference on Fairness, Accountability and Transparency (FAccT), 2024

...

Dylan Hadfield-Menell

AAML

557

131

25 Jan 2024