2210.01478
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Neural Information Processing Systems (NeurIPS), 2022
4 October 2022
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Amélie Reymond
J. Tenenbaum
Bernhard Schölkopf
ELM
LRM
Github (38★)
Papers citing "When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment"
50 / 75 papers shown
Fairness Metric Design Exploration in Multi-Domain Moral Sentiment Classification using Transformer-Based Models
Battemuulen Naranbat
Seyed Sahand Mohammadi Ziabari
Yousuf Nasser Al Husaini
Ali Mohammed Mansoor Alsahag
68
0
0
13 Oct 2025
Reasoning for Hierarchical Text Classification: The Case of Patents
Lekang Jiang
Wenjun Sun
Stephan Goetz
BDL
143
7
0
08 Oct 2025
EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
Kshitish Ghate
Andy Liu
Devansh Jain
Taylor Sorensen
Atoosa Kasirzadeh
Aylin Caliskan
Mona Diab
Maarten Sap
LLMSV
305
0
0
07 Oct 2025
RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity
Jisu Shin
Hoyun Song
Juhyun Oh
Changgeon Ko
Eunsu Kim
Chani Jung
Alice Oh
164
0
0
30 Sep 2025
Think Twice, Generate Once: Safeguarding by Progressive Self-Reflection
Hoang Phan
Victor Li
Qi Lei
KELM
CLL
178
0
0
29 Sep 2025
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
Sualeha Farid
Jayden Lin
Zean Chen
Shivani Kumar
David Jurgens
LRM
140
1
0
25 Sep 2025
Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants
Alessio Galatolo
Luca Alberto Rappuoli
Katie Winkle
Meriem Beloucif
ELM
138
1
0
18 Aug 2025
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chenchen Yuan
Zheyu Zhang
Shuo Yang
Bardh Prenkaj
Gjergji Kasneci
254
1
0
17 Jun 2025
Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs
Daniel Kilov
Caroline Hendy
Secil Yanik Guyot
Aaron J. Snoswell
Seth Lazar
ELM
283
3
0
16 Jun 2025
Multi-level Value Alignment in Agentic AI Systems: Survey and Perspectives
Wei Zeng
Hengshu Zhu
Chuan Qin
Han Wu
Yihang Cheng
...
Xiaowei Jin
Yinuo Shen
Zhenxing Wang
Feimin Zhong
Hui Xiong
AI4TS
433
0
0
11 Jun 2025
Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths
Inderjeet Nair
Lu Wang
187
1
0
03 Jun 2025
Large Language Models Often Know When They Are Being Evaluated
Joe Needham
Giles Edkins
Govind Pimpale
Henning Bartsch
Marius Hobbhahn
LLMAG
ELM
ALM
344
23
0
28 May 2025
When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas
Steffen Backmann
David Guzman Piedrahita
Emanuel Tewolde
Amélie Reymond
Bernhard Schölkopf
Zhijing Jin
291
4
0
25 May 2025
The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas
Ya Wu
Qiang Sheng
Danding Wang
Guang Yang
Yifan Sun
Zhengjia Wang
Yuyan Bu
Juan Cao
198
4
0
23 May 2025
Visual moral inference and communication
Warren Zhu
Aida Ramezani
Yang Xu
150
0
0
12 Apr 2025
RESPONSE: Benchmarking the Ability of Language Models to Undertake Commonsense Reasoning in Crisis Situation
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
ReLM
LRM
267
1
0
14 Mar 2025
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
Matthew DosSantos DiSorbo
Harang Ju
Sinan Aral
ELM
LRM
259
4
0
04 Mar 2025
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions
International Conference on Human Factors in Computing Systems (CHI), 2025
Vijay Keswani
Vincent Conitzer
Walter Sinnott-Armstrong
Breanna K. Nguyen
Hoda Heidari
Jana Schaich Borg
288
2
0
02 Mar 2025
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Shivani Kumar
David Jurgens
LRM
292
5
0
21 Feb 2025
Representation in large language models
Cameron C. Yetman
255
2
0
03 Jan 2025
M³oralBench: A MultiModal Moral Benchmark for LVLMs
Bei Yan
Jie M. Zhang
Zhiyuan Chen
Shiguang Shan
Xilin Chen
ELM
275
6
0
31 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
400
0
0
17 Dec 2024
Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?
Yue Huang
Zhengqing Yuan
Yujun Zhou
Kehan Guo
Xiangqi Wang
...
Weixiang Sun
Lichao Sun
Jindong Wang
Yanfang Ye
Wei Wei
LLMAG
167
23
0
30 Oct 2024
Who is Undercover? Guiding LLMs to Explore Multi-Perspective Team Tactic in the Game
Ruiqi Dong
Zhixuan Liao
Guangwei Lai
Yuhan Ma
Danni Ma
Chenyou Fan
LLMAG
202
1
0
20 Oct 2024
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Anvesh Rao Vijjini
Rakesh R Menon
Jiayi Fu
Shashank Srivastava
Snigdha Chaturvedi
ALM
214
4
0
11 Oct 2024
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
International Conference on Learning Representations (ICLR), 2024
Yu Ying Chiu
Liwei Jiang
Yejin Choi
304
25
0
03 Oct 2024
Recent Advancement of Emotion Cognition in Large Language Models
Yuyan Chen
Yanghua Xiao
OffRL
216
11
0
20 Sep 2024
Beyond Preferences in AI Alignment
Philosophical Studies (Philos. Stud.), 2024
Tan Zhi-Xuan
Micah Carroll
Matija Franklin
Hal Ashton
343
38
0
30 Aug 2024
CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Linhao Yu
Yongqi Leng
Yufei Huang
Shang Wu
Haixin Liu
...
Jinwang Song
Tingting Cui
Xiaoqing Cheng
Tao Liu
Deyi Xiong
ELM
126
9
0
19 Aug 2024
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference
Erxin Yu
Jing Li
Ming Liao
Siqi Wang
Zuchen Gao
Fei Mi
Lanqing Hong
ELM
LRM
258
38
0
25 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma
Xinpeng Wang
Tiancheng Hu
Anna Haensch
Michael A. Hedderich
Barbara Plank
Frauke Kreuter
ALM
294
16
0
16 Jun 2024
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science?
Desmond C. Ong
299
5
0
13 Jun 2024
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Neural Information Processing Systems (NeurIPS), 2024
Jingnan Zheng
Han Wang
An Zhang
Tai D. Nguyen
Jun Sun
Tat-Seng Chua
LLMAG
331
39
0
23 May 2024
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
Giorgio Piatti
Zhijing Jin
Max Kleiman-Weiner
Bernhard Schölkopf
Mrinmaya Sachan
Amélie Reymond
LLMAG
387
53
0
25 Apr 2024
Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models
Jan-Philipp Fränken
Kanishk Gandhi
Tori Qiu
Ayesha Khawaja
Noah D. Goodman
Tobias Gerstenberg
ELM
298
1
0
17 Apr 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
360
66
0
08 Apr 2024
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
Minzhi Li
Weiyan Shi
Caleb Ziems
Diyi Yang
258
11
0
28 Feb 2024
Eagle: Ethical Dataset Given from Real Interactions
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
191
4
0
22 Feb 2024
SaGE: Evaluating Moral Consistency in Large Language Models
Vamshi Krishna Bonagiri
Sreeram Vennam
Priyanshul Govil
Ponnurangam Kumaraguru
Manas Gaur
ELM
189
0
0
21 Feb 2024
Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs
Naihao Deng
Zhenjie Sun
Ruiqi He
Aman Sikka
Yulong Chen
Lin Ma
Yue Zhang
Amélie Reymond
LMTD
360
38
0
19 Feb 2024
Integration of cognitive tasks into artificial general intelligence test for large models
Youzhi Qu
Chen Wei
Penghui Du
Wenxin Che
Chi Zhang
...
Bin Hu
Kai Du
Haiyan Wu
Jia Liu
Quanying Liu
ELM
174
12
0
04 Feb 2024
Morality is Non-Binary: Building a Pluralist Moral Sentence Embedding Space using Contrastive Learning
Jeongwoo Park
Enrico Liscio
P. Murukannaiah
AILaw
269
7
0
30 Jan 2024
AI for social science and social science of AI: A Survey
Information Processing & Management (IPM), 2024
Ruoxi Xu
Yingfei Sun
Mengjie Ren
Shiguang Guo
Ruotong Pan
Hongyu Lin
Le Sun
Xianpei Han
251
92
0
22 Jan 2024
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Trușcă
Marie-Francine Moens
210
2
0
27 Nov 2023
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks
Neural Information Processing Systems (NeurIPS), 2023
Allen Nie
Yuhui Zhang
Atharva Amdekar
Chris Piech
Tatsunori Hashimoto
Tobias Gerstenberg
266
55
0
30 Oct 2023
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
International Conference on Learning Representations (ICLR), 2023
Niloofar Mireshghallah
Hyunwoo J. Kim
Xuhui Zhou
Yulia Tsvetkov
Maarten Sap
Reza Shokri
Yejin Choi
PILM
340
151
0
27 Oct 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kavel Rao
Liwei Jiang
Valentina Pyatkin
Yuling Gu
Niket Tandon
Nouha Dziri
Faeze Brahman
Yejin Choi
198
22
0
24 Oct 2023
Values, Ethics, Morals? On the Use of Moral Concepts in NLP Research
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Karina Vida
Judith Simon
Anne Lauscher
243
21
0
21 Oct 2023
Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
International Conference on Learning Representations (ICLR), 2023
Shitong Duan
Xiaoyuan Yi
Peng Zhang
Tun Lu
Xing Xie
Ning Gu
226
23
0
17 Oct 2023
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Seungju Han
Junhyeok Kim
Jack Hessel
Liwei Jiang
Jiwan Chung
Yejin Son
Yejin Choi
Youngjae Yu
166
5
0
16 Oct 2023