Moral Machine or Tyranny of the Majority?

27 May 2023

Hoda Heidari

Zachary Chase Lipton

Papers citing "Moral Machine or Tyranny of the Majority?"

Title
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions Vijay Keswani Vincent Conitzer Walter Sinnott-Armstrong Breanna K. Nguyen Hoda Heidari Jana Schaich Borg 46 0 0 02 Mar 2025
Societal Alignment Frameworks Can Improve LLM Alignment Karolina Stańczak Nicholas Meade Mehar Bhatia Hattie Zhou Konstantin Böttinger ... Timothy P. Lillicrap Ana Marasović Sylvie Delacroix Gillian K. Hadfield Siva Reddy 254 0 0 27 Feb 2025
AI Alignment at Your Discretion Maarten Buyl Hadi Khalaf C. M. Verdun Lucas Monteiro Paes Caio Vieira Machado Flavio du Pin Calmon 48 0 0 10 Feb 2025
Intuitions of Compromise: Utilitarianism vs. Contractualism Jared Moore Yejin Choi Sydney Levine 43 0 0 07 Oct 2024
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models Ned Cooper Alexandra Zafiroglu 44 0 0 27 Aug 2024
On The Stability of Moral Preferences: A Problem with Computational Elicitation Methods Kyle Boerstler Vijay Keswani Lok Chan Jana Schaich Borg Vincent Conitzer Hoda Heidari Walter Sinnott-Armstrong 44 2 0 05 Aug 2024
Pareto-Optimal Learning from Preferences with Hidden Context Ryan Boldi Li Ding Lee Spector S. Niekum 72 6 0 21 Jun 2024
Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback Vincent Conitzer Rachel Freedman J. Heitzig Wesley H. Holliday Bob M. Jacobs ... Eric Pacuit Stuart Russell Hailey Schoelkopf Emanuel Tewolde W. Zwicker 53 30 0 16 Apr 2024
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards Haoxiang Wang Yong Lin Wei Xiong Rui Yang Shizhe Diao Shuang Qiu Han Zhao Tong Zhang 45 72 0 28 Feb 2024
Wikibench: Community-Driven Data Curation for AI Evaluation on Wikipedia Tzu-Sheng Kuo Aaron L Halfaker Zirui Cheng Jiwoo Kim Meng-Hsin Wu Tongshuang Wu Kenneth Holstein Haiyi Zhu 67 21 0 21 Feb 2024
Personalized Language Modeling from Personalized Human Feedback Xinyu Li Zachary C. Lipton Liu Leqi ALM 76 48 0 06 Feb 2024
Red-Teaming for Generative AI: Silver Bullet or Security Theater? Michael Feffer Anusha Sinha Wesley Hanwen Deng Zachary Chase Lipton Hoda Heidari AAML 47 68 0 29 Jan 2024
Reinforcement Learning for Generative AI: State of the Art, Opportunities and Open Research Challenges Giorgio Franceschelli Mirco Musolesi AI4CE 42 20 0 31 Jul 2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback Stephen Casper Xander Davies Claudia Shi T. Gilbert Jérémy Scheurer ... Erdem Biyik Anca Dragan David M. Krueger Dorsa Sadigh Dylan Hadfield-Menell ALM OffRL 52 481 0 27 Jul 2023
Proportional Aggregation of Preferences for Sequential Decision Making Nikhil Chandak Shashwat Goel Dominik Peters 49 9 0 26 Jun 2023