2305.17319
Cited By
Moral Machine or Tyranny of the Majority?
27 May 2023
Michael Feffer
Hoda Heidari
Zachary Chase Lipton
Papers citing "Moral Machine or Tyranny of the Majority?"
15 / 15 papers shown
Title
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions
Vijay Keswani
Vincent Conitzer
Walter Sinnott-Armstrong
Breanna K. Nguyen
Hoda Heidari
Jana Schaich Borg
46
0
0
02 Mar 2025
Societal Alignment Frameworks Can Improve LLM Alignment
Karolina Stańczak
Nicholas Meade
Mehar Bhatia
Hattie Zhou
Konstantin Böttinger
...
Timothy P. Lillicrap
Ana Marasović
Sylvie Delacroix
Gillian K. Hadfield
Siva Reddy
254
0
0
27 Feb 2025
AI Alignment at Your Discretion
Maarten Buyl
Hadi Khalaf
C. M. Verdun
Lucas Monteiro Paes
Caio Vieira Machado
Flavio du Pin Calmon
48
0
0
10 Feb 2025
Intuitions of Compromise: Utilitarianism vs. Contractualism
Jared Moore
Yejin Choi
Sydney Levine
43
0
0
07 Oct 2024
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models
Ned Cooper
Alexandra Zafiroglu
44
0
0
27 Aug 2024
On The Stability of Moral Preferences: A Problem with Computational Elicitation Methods
Kyle Boerstler
Vijay Keswani
Lok Chan
Jana Schaich Borg
Vincent Conitzer
Hoda Heidari
Walter Sinnott-Armstrong
44
2
0
05 Aug 2024
Pareto-Optimal Learning from Preferences with Hidden Context
Ryan Boldi
Li Ding
Lee Spector
S. Niekum
72
6
0
21 Jun 2024
Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
Vincent Conitzer
Rachel Freedman
J. Heitzig
Wesley H. Holliday
Bob M. Jacobs
...
Eric Pacuit
Stuart Russell
Hailey Schoelkopf
Emanuel Tewolde
W. Zwicker
53
30
0
16 Apr 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
Haoxiang Wang
Yong Lin
Wei Xiong
Rui Yang
Shizhe Diao
Shuang Qiu
Han Zhao
Tong Zhang
45
72
0
28 Feb 2024
Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia
Tzu-Sheng Kuo
Aaron L Halfaker
Zirui Cheng
Jiwoo Kim
Meng-Hsin Wu
Tongshuang Wu
Kenneth Holstein
Haiyi Zhu
67
21
0
21 Feb 2024
Personalized Language Modeling from Personalized Human Feedback
Xinyu Li
Zachary C. Lipton
Liu Leqi
ALM
76
48
0
06 Feb 2024
Red-Teaming for Generative AI: Silver Bullet or Security Theater?
Michael Feffer
Anusha Sinha
Wesley Hanwen Deng
Zachary Chase Lipton
Hoda Heidari
AAML
47
68
0
29 Jan 2024
Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges
Giorgio Franceschelli
Mirco Musolesi
AI4CE
42
20
0
31 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Stephen Casper
Xander Davies
Claudia Shi
T. Gilbert
Jérémy Scheurer
...
Erdem Biyik
Anca Dragan
David M. Krueger
Dorsa Sadigh
Dylan Hadfield-Menell
ALM
OffRL
52
481
0
27 Jul 2023
Proportional Aggregation of Preferences for Sequential Decision Making
Nikhil Chandak
Shashwat Goel
Dominik Peters
49
9
0
26 Jun 2023