2402.05070
Cited By
A Roadmap to Pluralistic Alignment
7 February 2024
Taylor Sorensen
Jared Moore
Jillian R. Fisher
Mitchell L. Gordon
Niloofar Mireshghallah
Christopher Rytting
Andre Ye
Liwei Jiang
Ximing Lu
Nouha Dziri
Tim Althoff
Yejin Choi
Papers citing "A Roadmap to Pluralistic Alignment"
47 / 47 papers shown
TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories
Kirti Bhagat
Shaily Bhatt
Athul Velagapudi
Aditya Vashistha
Shachi Dave
Danish Pruthi
108
0
0
26 Nov 2025
Personalized Reward Modeling for Text-to-Image Generation
Jeongeun Lee
Ryang Heo
Dongha Lee
EGVM
156
0
0
21 Nov 2025
Pluralistic Behavior Suite: Stress-Testing Multi-Turn Adherence to Custom Behavioral Policies
Prasoon Varshney
Makesh Narsimhan Sreedhar
Liwei Jiang
Traian Rebedea
Christopher Parisien
114
0
0
07 Nov 2025
Human-AI Collaboration with Misaligned Preferences
Jiaxin Song
Parnian Shahkar
Kate Donahue
Bhaskar Ray Chaudhury
HAI
169
0
0
04 Nov 2025
Towards Low-Resource Alignment to Diverse Perspectives with Sparse Feedback
Chu Fei Luo
Samuel Dahan
Xiaodan Zhu
102
0
0
17 Oct 2025
MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages
Chenxi Whitehouse
Sebastian Ruder
Tony Lin
Oksana Kurylo
Haruka Takagi
Janice Lam
Nicolò Busetto
Denise Diaz
Francisco Guzmán
154
1
0
30 Sep 2025
Not My Agent, Not My Boundary? Elicitation of Personal Privacy Boundaries in AI-Delegated Information Sharing
Bingcan Guo
Eryue Xu
Zhiping Zhang
T. Li
131
3
0
26 Sep 2025
The Alignment Bottleneck
Wenjun Cao
217
0
0
19 Sep 2025
Decoding Alignment: A Critical Survey of LLM Development Initiatives through Value-setting and Data-centric Lens
Ilias Chalkidis
OffRL
ALM
154
1
0
23 Aug 2025
CUPID: Evaluating Personalized and Contextualized Alignment of LLMs from Interactions
Tae Soo Kim
Yoonjoo Lee
Yoonah Park
Jiho Kim
Young-Ho Kim
Juho Kim
232
1
0
03 Aug 2025
The Homogenizing Effect of Large Language Models on Human Expression and Thought
Zhivar Sourati
Alireza S. Ziabari
Morteza Dehghani
171
3
0
02 Aug 2025
Learning to summarize user information for personalized reinforcement learning from human feedback
Hyunji Nam
Yanming Wan
Mickel Liu
Jianxun Lian
Peter Ahnn
Natasha Jaques
229
0
0
17 Jul 2025
Reward Model Interpretability via Optimal and Pessimal Tokens
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Brian Christian
Hannah Rose Kirk
Jessica A.F. Thompson
Christopher Summerfield
Tsvetomira Dumbalska
AAML
237
6
0
08 Jun 2025
QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
A. Tang
Xiuzhen Zhang
M. Dinh
Zhuang Li
RALM
231
0
0
04 Jun 2025
Aligning VLM Assistants with Personalized Situated Cognition
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Yongqi Li
Shen Zhou
Xiaohu Li
Xin Miao
Jintao Wen
...
Birong Pan
Hankun Kang
Yuanyuan Zhu
Ming Zhong
T. Qian
242
1
0
01 Jun 2025
Meaning Is Not A Metric: Using LLMs to make cultural context legible at scale
Cody Kommers
Drew Hemment
Maria Antoniak
Joel Z. Leibo
Hoyt Long
Emily Robinson
Adam Sobey
214
6
0
23 May 2025
AI-Augmented LLMs Achieve Therapist-Level Responses in Motivational Interviewing
Yinghui Huang
Yuxuan Jiang
Hui Liu
Yixin Cai
Weiqing Li
Xiangen Hu
AI4MH
485
0
0
23 May 2025
Is Active Persona Inference Necessary for Aligning Small Models to Personal Preferences?
Zilu Tang
Afra Feyza Akyürek
Ekin Akyürek
Derry Wijaya
394
0
0
19 May 2025
Pairwise Calibrated Rewards for Pluralistic Alignment
Daniel Halpern
Evi Micha
Ariel D. Procaccia
Itai Shapira
220
0
0
17 May 2025
LoRe: Personalizing LLMs via Low-Rank Reward Modeling
Avinandan Bose
Zhihan Xiong
Yuejie Chi
Simon S. Du
Lin Xiao
Maryam Fazel
293
9
0
20 Apr 2025
Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xiaotian Zhang
Ruizhe Chen
Yang Feng
Zuozhu Liu
376
4
0
17 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models
Aryan Shrivastava
Paula Akemi Aoyagui
303
1
0
14 Apr 2025
Societal Impacts Research Requires Benchmarks for Creative Composition Tasks
Judy Hanwen Shen
Carlos Guestrin
612
2
0
09 Apr 2025
Strategyproof Reinforcement Learning from Human Feedback
Thomas Kleine Buening
Jiarui Gan
Debmalya Mandal
Marta Z. Kwiatkowska
280
2
0
12 Mar 2025
CoPL: Collaborative Preference Learning for Personalizing LLMs
Youngbin Choi
Seunghyuk Cho
M. Lee
Moonjeong Park
Yesong Ko
Jungseul Ok
Dongwoo Kim
379
0
0
03 Mar 2025
On Benchmarking Human-Like Intelligence in Machines
Lance Ying
Katherine M. Collins
L. Wong
Ilia Sucholutsky
Ryan Liu
Adrian Weller
Tianmin Shu
Thomas Griffiths
Joshua B. Tenenbaum
ALM
ELM
913
20
0
27 Feb 2025
Is Free Self-Alignment Possible?
Dyah Adila
Changho Shin
Yijing Zhang
Frederic Sala
MoMe
426
2
0
24 Feb 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
VLM
397
14
0
24 Feb 2025
C3AI: Crafting and Evaluating Constitutions for Constitutional AI
The Web Conference (WWW), 2025
Yara Kyrychenko
Ke Zhou
Edyta Bogucka
Daniele Quercia
ELM
228
12
0
21 Feb 2025
AI Alignment at Your Discretion
Conference on Fairness, Accountability and Transparency (FAccT), 2025
Maarten Buyl
Hadi Khalaf
C. M. Verdun
Lucas Monteiro Paes
Caio Vieira Machado
Flavio du Pin Calmon
320
10
0
10 Feb 2025
Clone-Robust AI Alignment
Ariel D. Procaccia
Benjamin G. Schiffer
Shirley Zhang
210
5
0
17 Jan 2025
Evaluating the Prompt Steerability of Large Language Models
Erik Miehling
Michael Desmond
Karthikeyan N. Ramamurthy
Elizabeth M. Daly
Pierre Dognin
Jesus Rios
Djallel Bouneffouf
Miao Liu
LLMSV
435
14
0
19 Nov 2024
Ethics Whitepaper: Whitepaper on Ethical Research into Large Language Models
Eddie L. Ungless
Nikolas Vitsakis
Zeerak Talat
James Garforth
Bjorn Ross
Arno Onken
Atoosa Kasirzadeh
Alexandra Birch
262
3
0
17 Oct 2024
Large Language Models, and LLM-Based Agents, Should Be Used to Enhance the Digital Public Sphere
Seth Lazar
Luke Thorburn
Tian Jin
Luca Belli
264
4
0
15 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
International Conference on Learning Representations (ICLR), 2024
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
236
4
0
14 Oct 2024
Intuitions of Compromise: Utilitarianism vs. Contractualism
Jared Moore
Yejin Choi
Sydney Levine
230
1
0
07 Oct 2024
Moral Alignment for LLM Agents
International Conference on Learning Representations (ICLR), 2024
Elizaveta Tennant
Stephen Hailes
Mirco Musolesi
507
23
0
02 Oct 2024
Policy Maps: Tools for Guiding the Unbounded Space of LLM Behaviors
ACM Symposium on User Interface Software and Technology (UIST), 2024
Michelle S. Lam
Fred Hohman
Dominik Moritz
Jeffrey P. Bigham
Kenneth Holstein
Mary Beth Kery
263
1
0
26 Sep 2024
Open-World Evaluation for Retrieving Diverse Perspectives
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Hung-Ting Chen
Eunsol Choi
354
3
0
26 Sep 2024
Policy Prototyping for LLMs: Pluralistic Alignment via Interactive and Collaborative Policymaking
K. J. Kevin Feng
Inyoung Cheong
Quan Ze Chen
Amy X. Zhang
338
6
0
13 Sep 2024
Programming Refusal with Conditional Activation Steering
International Conference on Learning Representations (ICLR), 2024
Bruce W. Lee
Inkit Padhi
Karthikeyan N. Ramamurthy
Erik Miehling
Pierre Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
502
70
0
06 Sep 2024
User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
International Conference on Human Factors in Computing Systems (CHI), 2024
Xianzhe Fan
Qing Xiao
Xuhui Zhou
Jiaxin Pei
Maarten Sap
Zhicong Lu
Hong Shen
314
23
0
01 Sep 2024
Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Tingchen Fu
Yupeng Hou
Julian McAuley
Rui Yan
309
6
0
09 Aug 2024
Generative Monoculture in Large Language Models
Fan Wu
Emily Black
Varun Chandrasekaran
SyDa
204
10
0
02 Jul 2024
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment
Thom Lake
Eunsol Choi
Greg Durrett
420
27
0
25 Jun 2024
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
IEEE Internet Computing (IEEE Internet Comput.), 2024
Swapnaja Achintalwar
Ioana Baldini
Djallel Bouneffouf
Joan Byamugisha
Maria Chang
...
P. Sattigeri
Moninder Singh
S. Thwala
Rosario A. Uceda-Sosa
Kush R. Varshney
226
12
0
08 Mar 2024
Black-Box Access is Insufficient for Rigorous AI Audits
Conference on Fairness, Accountability and Transparency (FAccT), 2024
Stephen Casper
Carson Ezell
Charlotte Siegmann
Noam Kolt
Taylor Lynn Curtis
...
Michael Gerovitch
David Bau
Max Tegmark
David M. Krueger
Dylan Hadfield-Menell
AAML
557
131
0
25 Jan 2024