Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.11207
Cited By
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
17 October 2023
Shiyuan Huang
Siddarth Mamidanna
Shreedhar Jangam
Yilun Zhou
Leilani H. Gilpin
LRM
MILM
ELM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations"
14 / 14 papers shown
Title
Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models
X. Wang
Haoyang Li
Zeyang Zhang
H. Chen
Wenwu Zhu
LRM
77
0
0
28 Apr 2025
SEER: Self-Explainability Enhancement of Large Language Models' Representations
Guanxu Chen
Dongrui Liu
Tao Luo
Jing Shao
LRM
MILM
62
1
0
07 Feb 2025
ProSLM : A Prolog Synergized Language Model for explainable Domain Specific Knowledge Based Question Answering
Priyesh Vakharia
Abigail Kufeldt
Max Meyers
Ian Lane
Leilani H. Gilpin
19
0
0
17 Sep 2024
DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
John Wu
David Wu
Jimeng Sun
65
0
0
16 Sep 2024
Evaluating Human Alignment and Model Faithfulness of LLM Rationale
Mohsen Fayyaz
Fan Yin
Jiao Sun
Nanyun Peng
48
3
0
28 Jun 2024
Unveiling LLM Mechanisms Through Neural ODEs and Control Theory
Yukun Zhang
Qi Dong
34
0
0
23 Jun 2024
The Solvability of Interpretability Evaluation Metrics
Yilun Zhou
J. Shah
62
8
0
18 May 2022
Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations
Jessica Dai
Sohini Upadhyay
Ulrich Aivodji
Stephen H. Bach
Himabindu Lakkaraju
35
56
0
15 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
Satyapriya Krishna
Tessa Han
Alex Gu
Steven Wu
S. Jabbari
Himabindu Lakkaraju
172
183
0
03 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
"Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
Jasmijn Bastings
Sebastian Ebert
Polina Zablotskaia
Anders Sandholm
Katja Filippova
107
75
0
14 Nov 2021
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktaschel
Thomas Lukasiewicz
Phil Blunsom
LRM
252
620
0
04 Dec 2018
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
199
199
0
02 May 2018
1