2210.01478
Cited By
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Neural Information Processing Systems (NeurIPS), 2022
4 October 2022
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Amélie Reymond
J. Tenenbaum
Bernhard Schölkopf
ELM
LRM
Github (38★)
Papers citing "When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment"
25 / 75 papers shown
Prompt Packer: Deceiving LLMs through Compositional Instruction with Hidden Attacks
Shuyu Jiang
Xingshu Chen
Rui Tang
275
33
0
16 Oct 2023
The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hannah Rose Kirk
Andrew M. Bean
Bertie Vidgen
Paul Röttger
Scott A. Hale
ALM
355
61
0
11 Oct 2023
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models
AI & Society, 2023
Yuwei Wang
Enmeng Lu
Zizhe Ruan
Yao Liang
Yi Zeng
AI4TS
209
5
0
09 Oct 2023
ValueDCG: Measuring Comprehensive Human Value Understanding Ability of Language Models
Zhaowei Zhang
Fengshuo Bai
Jun Gao
Yaodong Yang
PILM
ELM
320
5
0
30 Sep 2023
Foundation Metrics for Evaluating Effectiveness of Healthcare Conversations Powered by Generative AI
Mahyar Abbasian
Elahe Khatibi
Iman Azimi
David Oniani
Zahra Shakeri Hossein Abad
...
Bryant Lin
Olivier Gevaert
Li-Jia Li
Ramesh C. Jain
Amir M. Rahmani
LM&MA
ELM
AI4MH
497
119
0
21 Sep 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas Griffiths
LLMAG
LM&Ro
615
275
0
05 Sep 2023
Mind vs. Mouth: On Measuring Re-judge Inconsistency of Social Bias in Large Language Models
Yachao Zhao
Bo Wang
Dongming Zhao
Kun Huang
Yan Wang
Ruifang He
Yuexian Hou
178
4
0
24 Aug 2023
From Instructions to Intrinsic Human Values -- A Survey of Alignment Goals for Big Models
Jing Yao
Xiaoyuan Yi
Xiting Wang
Yongfeng Zhang
Xing Xie
ALM
385
56
0
23 Aug 2023
Evaluating the Moral Beliefs Encoded in LLMs
Neural Information Processing Systems (NeurIPS), 2023
Nino Scherrer
Claudia Shi
Amir Feder
David M. Blei
247
201
0
26 Jul 2023
Minimum Levels of Interpretability for Artificial Moral Agents
AI and Ethics (AE), 2023
Avish Vijayaraghavan
C. Badea
AI4CE
163
6
0
02 Jul 2023
Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory
Masashi Takeshita
Rafal Rzepka
K. Araki
183
11
0
20 Jun 2023
Toward Grounded Commonsense Reasoning
IEEE International Conference on Robotics and Automation (ICRA), 2023
Minae Kwon
Hengyuan Hu
Vivek Myers
Siddharth Karamcheti
Anca Dragan
Dorsa Sadigh
LM&Ro
ReLM
LRM
270
14
0
14 Jun 2023
Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence
John J. Nay
David Karamardian
Sarah Lawsky
Wenting Tao
Meghana Moorthy Bhat
Raghav Jain
Aaron Travis Lee
Jonathan H. Choi
Jungo Kasai
ELM
AILaw
306
83
0
12 Jun 2023
Interpretable Math Word Problem Solution Generation Via Step-by-step Planning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Mengxue Zhang
Zichao Wang
Zhichao Yang
Weiqi Feng
Andrew Lan
LRM
158
23
0
01 Jun 2023
ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing
Ryan Liu
Nihar B. Shah
ELM
188
106
0
01 Jun 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Amélie Reymond
LRM
316
8
0
21 May 2023
"Oops, Did I Just Say That?" Testing and Repairing Unethical Suggestions of Large Language Models with Suggest-Critique-Reflect Process
Anna Glazkova
Zongjie Li
Michael Kadantsev
Maksim Glazkov
KELM
191
15
0
04 May 2023
Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods
Thilo Hagendorff
LLMAG
390
72
0
24 Mar 2023
Language Model Behavior: A Comprehensive Survey
Computational Linguistics (CL), 2023
Tyler A. Chang
Benjamin Bergen
VLM
LRM
LM&MA
372
139
0
20 Mar 2023
Susceptibility to Influence of Large Language Models
Lewis D. Griffin
Bennett Kleinberg
Maximilian Mozes
Kimberly T. Mai
Maria Vau
M. Caldwell
Augustine N. Mavor-Parker
199
17
0
10 Mar 2023
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
Hannah Rose Kirk
Bertie Vidgen
Paul Röttger
Scott A. Hale
277
123
0
09 Mar 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism
International Workshop on Semantic Evaluation (SemEval), 2023
Hannah Rose Kirk
Wenjie Yin
Bertie Vidgen
Paul Röttger
291
144
0
07 Mar 2023
Revision Transformers: Instructing Language Models to Change their Values
European Conference on Artificial Intelligence (ECAI), 2022
Felix Friedrich
Wolfgang Stammer
P. Schramowski
Kristian Kersting
KELM
254
11
0
19 Oct 2022
Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Gabriel Simmons
357
84
0
24 Sep 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Neural Information Processing Systems (NeurIPS), 2022
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
555
167
0
14 Jun 2022