arXiv: 2401.07927

Are self-explanations from Large Language Models faithful?
Andreas Madsen, Sarath Chandar, Siva Reddy
15 January 2024
Tags: LRM
Papers citing "Are self-explanations from Large Language Models faithful?" (10 papers shown):

1. SEER: Self-Explainability Enhancement of Large Language Models' Representations
   Guanxu Chen, Dongrui Liu, Tao Luo, Jing Shao. Tags: LRM, MILM. 07 Feb 2025.

2. Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
   Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar. 22 Oct 2024.

3. Evaluating the Reliability of Self-Explanations in Large Language Models
   Korbinian Randl, John Pavlopoulos, Aron Henriksson, Tony Lindgren. Tags: LRM. 19 Jul 2024.

4. Evaluating Human Alignment and Model Faithfulness of LLM Rationale
   Mohsen Fayyaz, Fan Yin, Jiao Sun, Nanyun Peng. 28 Jun 2024.

5. CELL your Model: Contrastive Explanations for Large Language Models
   Ronny Luss, Erik Miehling, Amit Dhurandhar. 17 Jun 2024.

6. SCOTT: Self-Consistent Chain-of-Thought Distillation
   Jamie Yap, Zhengyang Wang, Zheng Li, K. Lynch, Bing Yin, Xiang Ren. Tags: LRM. 03 May 2023.

7. "Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
   Jasmijn Bastings, Sebastian Ebert, Polina Zablotskaia, Anders Sandholm, Katja Filippova. 14 Nov 2021.

8. Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
   Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, Siva Reddy. 15 Oct 2021.

9. e-SNLI: Natural Language Inference with Natural Language Explanations
   Oana-Maria Camburu, Tim Rocktaschel, Thomas Lukasiewicz, Phil Blunsom. Tags: LRM. 04 Dec 2018.

10. Towards A Rigorous Science of Interpretable Machine Learning
    Finale Doshi-Velez, Been Kim. Tags: XAI, FaML. 28 Feb 2017.