When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

Neural Information Processing Systems (NeurIPS), 2022

4 October 2022

Papers citing "When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment"

Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks

Shuyu Jiang

Xingshu Chen

Rui Tang

275

16 Oct 2023

The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Paul Röttger

355

11 Oct 2023

STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models

AI & Society, 2023

Yi Zeng

209

09 Oct 2023

ValueDCG: Measuring Comprehensive Human Value Understanding Ability of Language Models

320

30 Sep 2023

Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI

Zahra Shakeri Hossein Abad

...

497

119

21 Sep 2023

Cognitive Architectures for Language Agents

615

275

05 Sep 2023

Mind vs. Mouth: On Measuring Re-judge Inconsistency of Social Bias in Large Language Models

178

24 Aug 2023

From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models

Xing Xie

385

23 Aug 2023

Evaluating the Moral Beliefs Encoded in LLMs

Neural Information Processing Systems (NeurIPS), 2023

247

201

26 Jul 2023

Minimum Levels of Interpretability for Artificial Moral Agents

AI and Ethics (AE), 2023

Avish Vijayaraghavan

C. Badea

AI4CE

163

02 Jul 2023

Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory

Masashi Takeshita

Rafal Rzepka

K. Araki

183

20 Jun 2023

Toward Grounded Commonsense Reasoning

IEEE International Conference on Robotics and Automation (ICRA), 2023

Dorsa Sadigh

270

14 Jun 2023

Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence

306

12 Jun 2023

Interpretable Math Word Problem Solution Generation Via Step-by-step Planning

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

158

01 Jun 2023

ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing

Ryan Liu

Nihar B. Shah

ELM

188

106

01 Jun 2023

Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models

International Conference on Language Resources and Evaluation (LREC), 2023

...

316

21 May 2023

"Oops, Did I Just Say That?" Testing and Repairing Unethical Suggestions of Large Language Models with Suggest-Critique-Reflect Process

191

04 May 2023

Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods

Thilo Hagendorff

LLMAG

390

24 Mar 2023

Language Model Behavior: A Comprehensive Survey

International Conference on Computational Logic (ICCL), 2023

Tyler A. Chang

Benjamin Bergen

VLM LRM LM&MA

372

139

20 Mar 2023

Susceptibility to Influence of Large Language Models

Augustine N. Mavor-Parker

199

10 Mar 2023

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback

Hannah Rose Kirk

Bertie Vidgen

Paul Röttger

Scott A. Hale

277

123

09 Mar 2023

SemEval-2023 Task 10: Explainable Detection of Online Sexism

International Workshop on Semantic Evaluation (SemEval), 2023

Hannah Rose Kirk

Wenjie Yin

Bertie Vidgen

Paul Röttger

291

144

07 Mar 2023

Revision Transformers: Instructing Language Models to Change their Values

European Conference on Artificial Intelligence (ECAI), 2022

254

19 Oct 2022

Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

Gabriel Simmons

357

24 Sep 2022

LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks

Neural Information Processing Systems (NeurIPS), 2022

Dimitris Papailiopoulos

Kangwook Lee

LMTD

555

167

14 Jun 2022