arXiv: 2405.06800
LLM-Generated Black-box Explanations Can Be Adversarially Helpful
10 May 2024
R. Ajwani, Shashidhar Reddy Javaji, Frank Rudzicz, Zining Zhu
AAML

Papers citing "LLM-Generated Black-box Explanations Can Be Adversarially Helpful" (5 / 5 papers shown)

REV: Information-Theoretic Evaluation of Free-Text Rationales
Hanjie Chen, Faeze Brahman, Xiang Ren, Yangfeng Ji, Yejin Choi, Swabha Swayamdipta
10 Oct 2022

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli, Liane Lovitt, John Kernion, Amanda Askell, Yuntao Bai, ..., Nicholas Joseph, Sam McCandlish, C. Olah, Jared Kaplan, Jack Clark
23 Aug 2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou
LM&Ro, LRM, AI4CE, ReLM
28 Jan 2022

Explainability Pitfalls: Beyond Dark Patterns in Explainable AI
Upol Ehsan, Mark O. Riedl
XAI, SILM
26 Sep 2021

Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI
Alon Jacovi, Ana Marasović, Tim Miller, Yoav Goldberg
15 Oct 2020