Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory

20 June 2023

Papers citing "Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory"

Title
The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas
Giovanni Franco Gabriel Marraffini, Andrés Cotton, Noe Fabian Hsueh, Axel Fridman, Juan Wisznia, Luciano Del Corro
31 0 0 | 25 Mar 2025

ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun, Wei Gao, Jing Ma, Hongzhan Lin, Ziyang Luo, Wenxuan Zhang
ELM | 82 0 0 | 17 Dec 2024

Intuitions of Compromise: Utilitarianism vs. Contractualism
Jared Moore, Yejin Choi, Sydney Levine
33 0 0 | 07 Oct 2024

An Evaluation of GPT-4 on the ETHICS Dataset
Sergey Rodionov, Z. Goertzel, Ben Goertzel
27 4 0 | 19 Sep 2023

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli, Liane Lovitt, John Kernion, Amanda Askell, Yuntao Bai, ..., Nicholas Joseph, Sam McCandlish, C. Olah, Jared Kaplan, Jack Clark
225 444 0 | 23 Aug 2022

Can Machines Learn Morality? The Delphi Experiment
Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny T Liang, ..., Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina A. Rini, Yejin Choi
FaML | 127 111 0 | 14 Oct 2021