2407.12043
The Art of Saying No: Contextual Noncompliance in Language Models
2 July 2024
Faeze Brahman
Sachin Kumar
Vidhisha Balachandran
Pradeep Dasigi
Valentina Pyatkin
Abhilasha Ravichander
Sarah Wiegreffe
Nouha Dziri
Khyathi Raghavi Chandu
Jack Hessel
Yulia Tsvetkov
Noah A. Smith
Yejin Choi
Hannaneh Hajishirzi
Papers citing "The Art of Saying No: Contextual Noncompliance in Language Models"
9 / 9 papers shown
Title
Programming Refusal with Conditional Activation Steering
Bruce W. Lee
Inkit Padhi
K. Ramamurthy
Erik Miehling
Pierre L. Dognin
Manish Nagireddy
Amit Dhurandhar
LLMSV
87
13
0
06 Sep 2024
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao
Xiang Ren
Jack Hessel
Claire Cardie
Yejin Choi
Yuntian Deng
37
44
0
02 May 2024
Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs
Michael J.Q. Zhang
Eunsol Choi
24
6
0
16 Nov 2023
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
49
59
0
14 Oct 2022
Reducing conversational agents' overconfidence through linguistic calibration
Sabrina J. Mielke
Arthur Szlam
Emily Dinan
Y-Lan Boureau
193
108
0
30 Dec 2020
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
261
1,386
0
14 Dec 2020
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI
Alon Jacovi
Ana Marasović
Tim Miller
Yoav Goldberg
236
417
0
15 Oct 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
229
288
0
17 Mar 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020