ResearchTrend.AI
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Neural Information Processing Systems (NeurIPS), 2022
4 October 2022
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Amélie Reymond
J. Tenenbaum
Bernhard Schölkopf
ELM, LRM
arXiv: 2210.01478 (abs) · PDF · HTML · GitHub (38★)

Papers citing "When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment"

50 / 75 papers shown
Fairness Metric Design Exploration in Multi-Domain Moral Sentiment Classification using Transformer-Based Models
Battemuulen Naranbat
Seyed Sahand Mohammadi Ziabari
Yousuf Nasser Al Husaini
Ali Mohammed Mansoor Alsahag
68
0
0
13 Oct 2025
Reasoning for Hierarchical Text Classification: The Case of Patents
Lekang Jiang
Wenjun Sun
Stephan Goetz
BDL
143
7
0
08 Oct 2025
EVALUESTEER: Measuring Reward Model Steerability Towards Values and Preferences
Kshitish Ghate
Andy Liu
Devansh Jain
Taylor Sorensen
Atoosa Kasirzadeh
Aylin Caliskan
Mona Diab
Maarten Sap
LLMSV
305
0
0
07 Oct 2025
RoleConflictBench: A Benchmark of Role Conflict Scenarios for Evaluating LLMs' Contextual Sensitivity
Jisu Shin
Hoyun Song
Juhyun Oh
Changgeon Ko
Eunsu Kim
Chani Jung
Alice Oh
164
0
0
30 Sep 2025
Think Twice, Generate Once: Safeguarding by Progressive Self-Reflection
Hoang Phan
Victor Li
Qi Lei
KELM, CLL
178
0
0
29 Sep 2025
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
Sualeha Farid
Jayden Lin
Zean Chen
Shivani Kumar
David Jurgens
LRM
140
1
0
25 Sep 2025
Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants
Alessio Galatolo
Luca Alberto Rappuoli
Katie Winkle
Meriem Beloucif
ELM
138
1
0
18 Aug 2025
Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chenchen Yuan
Zheyu Zhang
Shuo Yang
Bardh Prenkaj
Gjergji Kasneci
254
1
0
17 Jun 2025
Discerning What Matters: A Multi-Dimensional Assessment of Moral Competence in LLMs
Daniel Kilov
Caroline Hendy
Secil Yanik Guyot
Aaron J. Snoswell
Seth Lazar
ELM
283
3
0
16 Jun 2025
Multi-level Value Alignment in Agentic AI Systems: Survey and Perspectives
Wei Zeng
Hengshu Zhu
Chuan Qin
Han Wu
Yihang Cheng
...
Xiaowei Jin
Yinuo Shen
Zhenxing Wang
Feimin Zhong
Hui Xiong
AI4TS
433
0
0
11 Jun 2025
Do Language Models Think Consistently? A Study of Value Preferences Across Varying Response Lengths
Inderjeet Nair
Lu Wang
187
1
0
03 Jun 2025
Large Language Models Often Know When They Are Being Evaluated
Joe Needham
Giles Edkins
Govind Pimpale
Henning Bartsch
Marius Hobbhahn
LLMAG, ELM, ALM
344
23
0
28 May 2025
When Ethics and Payoffs Diverge: LLM Agents in Morally Charged Social Dilemmas
Steffen Backmann
David Guzman Piedrahita
Emanuel Tewolde
Amélie Reymond
Bernhard Schölkopf
Zhijing Jin
291
4
0
25 May 2025
The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas
Ya Wu
Qiang Sheng
Danding Wang
Guang Yang
Yifan Sun
Zhengjia Wang
Yuyan Bu
Juan Cao
198
4
0
23 May 2025
Visual moral inference and communication
Warren Zhu
Aida Ramezani
Yang Xu
150
0
0
12 Apr 2025
RESPONSE: Benchmarking the Ability of Language Models to Undertake Commonsense Reasoning in Crisis Situation
Aissatou Diallo
Antonis Bikakis
Luke Dickens
Anthony Hunter
Rob Miller
ReLM, LRM
267
1
0
14 Mar 2025
Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
Matthew DosSantos DiSorbo
Harang Ju
Sinan Aral
ELM, LRM
267
4
0
04 Mar 2025
Can AI Model the Complexities of Human Moral Decision-Making? A Qualitative Study of Kidney Allocation Decisions
International Conference on Human Factors in Computing Systems (CHI), 2025
Vijay Keswani
Vincent Conitzer
Walter Sinnott-Armstrong
Breanna K. Nguyen
Hoda Heidari
Jana Schaich Borg
288
2
0
02 Mar 2025
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Shivani Kumar
David Jurgens
LRM
296
5
0
21 Feb 2025
Representation in large language models
Cameron C. Yetman
255
2
0
03 Jan 2025
M$^3$oralBench: A MultiModal Moral Benchmark for LVLMs
Bei Yan
Jie M. Zhang
Zhiyuan Chen
Shiguang Shan
Xilin Chen
ELM
275
6
0
31 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
400
0
0
17 Dec 2024
Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?
Yue Huang
Zhengqing Yuan
Yujun Zhou
Kehan Guo
Xiangqi Wang
...
Weixiang Sun
Lichao Sun
Jindong Wang
Yanfang Ye
Wei Wei
LLMAG
167
23
0
30 Oct 2024
Who is Undercover? Guiding LLMs to Explore Multi-Perspective Team Tactic in the Game
Ruiqi Dong
Zhixuan Liao
Guangwei Lai
Yuhan Ma
Danni Ma
Chenyou Fan
LLMAG
202
1
0
20 Oct 2024
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Anvesh Rao Vijjini
Rakesh R Menon
Jiayi Fu
Shashank Srivastava
Snigdha Chaturvedi
ALM
214
4
0
11 Oct 2024
DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life
International Conference on Learning Representations (ICLR), 2024
Yu Ying Chiu
Liwei Jiang
Yejin Choi
304
25
0
03 Oct 2024
Recent Advancement of Emotion Cognition in Large Language Models
Yuyan Chen
Yanghua Xiao
OffRL
216
11
0
20 Sep 2024
Beyond Preferences in AI Alignment
Philosophical Studies (Philos. Stud.), 2024
Tan Zhi-Xuan
Micah Carroll
Matija Franklin
Hal Ashton
343
38
0
30 Aug 2024
CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Linhao Yu
Yongqi Leng
Yufei Huang
Shang Wu
Haixin Liu
...
Jinwang Song
Tingting Cui
Xiaoqing Cheng
Tao Liu
Deyi Xiong
ELM
126
9
0
19 Aug 2024
CoSafe: Evaluating Large Language Model Safety in Multi-Turn Dialogue Coreference
Erxin Yu
Jing Li
Ming Liao
Siqi Wang
Zuchen Gao
Fei Mi
Lanqing Hong
ELM, LRM
258
38
0
25 Jun 2024
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models
Bolei Ma
Xinpeng Wang
Tiancheng Hu
Anna Haensch
Michael A. Hedderich
Barbara Plank
Frauke Kreuter
ALM
294
16
0
16 Jun 2024
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science?
Desmond C. Ong
299
5
0
13 Jun 2024
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Neural Information Processing Systems (NeurIPS), 2024
Jingnan Zheng
Han Wang
An Zhang
Tai D. Nguyen
Jun Sun
Tat-Seng Chua
LLMAG
343
39
0
23 May 2024
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents
Giorgio Piatti
Zhijing Jin
Max Kleiman-Weiner
Bernhard Schölkopf
Mrinmaya Sachan
Amélie Reymond
LLMAG
387
53
0
25 Apr 2024
Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models
Jan-Philipp Fränken
Kanishk Gandhi
Tori Qiu
Ayesha Khawaja
Noah D. Goodman
Tobias Gerstenberg
ELM
298
1
0
17 Apr 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM, KELM
360
66
0
08 Apr 2024
Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future
Minzhi Li
Weiyan Shi
Caleb Ziems
Diyi Yang
258
11
0
28 Feb 2024
Eagle: Ethical Dataset Given from Real Interactions
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
191
4
0
22 Feb 2024
SaGE: Evaluating Moral Consistency in Large Language Models
Vamshi Krishna Bonagiri
Sreeram Vennam
Priyanshul Govil
Ponnurangam Kumaraguru
Manas Gaur
ELM
189
0
0
21 Feb 2024
Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs
Naihao Deng
Zhenjie Sun
Ruiqi He
Aman Sikka
Yulong Chen
Lin Ma
Yue Zhang
Amélie Reymond
LMTD
360
38
0
19 Feb 2024
Integration of cognitive tasks into artificial general intelligence test for large models
Youzhi Qu
Chen Wei
Penghui Du
Wenxin Che
Chi Zhang
...
Bin Hu
Kai Du
Haiyan Wu
Jia Liu
Quanying Liu
ELM
174
12
0
04 Feb 2024
Morality is Non-Binary: Building a Pluralist Moral Sentence Embedding Space using Contrastive Learning
Jeongwoo Park
Enrico Liscio
P. Murukannaiah
AILaw
269
7
0
30 Jan 2024
AI for social science and social science of AI: A Survey
Information Processing & Management (IPM), 2024
Ruoxi Xu
Yingfei Sun
Mengjie Ren
Shiguang Guo
Ruotong Pan
Hongyu Lin
Le Sun
Xianpei Han
251
92
0
22 Jan 2024
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments
Liesbeth Allein
Maria Mihaela Truşcă
Marie-Francine Moens
210
2
0
27 Nov 2023
MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks
Neural Information Processing Systems (NeurIPS), 2023
Allen Nie
Yuhui Zhang
Atharva Amdekar
Chris Piech
Tatsunori Hashimoto
Tobias Gerstenberg
266
55
0
30 Oct 2023
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
International Conference on Learning Representations (ICLR), 2023
Niloofar Mireshghallah
Hyunwoo J. Kim
Xuhui Zhou
Yulia Tsvetkov
Maarten Sap
Reza Shokri
Yejin Choi
PILM
340
151
0
27 Oct 2023
What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kavel Rao
Liwei Jiang
Valentina Pyatkin
Yuling Gu
Niket Tandon
Nouha Dziri
Faeze Brahman
Yejin Choi
198
22
0
24 Oct 2023
Values, Ethics, Morals? On the Use of Moral Concepts in NLP Research
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Karina Vida
Judith Simon
Anne Lauscher
243
21
0
21 Oct 2023
Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning
International Conference on Learning Representations (ICLR), 2023
Shitong Duan
Xiaoyuan Yi
Peng Zhang
Tun Lu
Xing Xie
Ning Gu
226
23
0
17 Oct 2023
Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Commonsense Norms
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Seungju Han
Junhyeok Kim
Jack Hessel
Liwei Jiang
Jiwan Chung
Yejin Son
Yejin Choi
Youngjae Yu
166
5
0
16 Oct 2023