When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data (3 February 2021) [XAI]
Peter Hase, Mohit Bansal
arXiv: 2102.02201
Papers citing "When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data" (23 of 23 papers shown)
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation (10 Apr 2025)
Tuhin Chakrabarty, Philippe Laban, C. Wu

Policy-to-Language: Train LLMs to Explain Decisions with Flow-Matching Generated Rewards (18 Feb 2025)
Xinyi Yang, Liang Zeng, Heng Dong, C. Yu, X. Wu, H. Yang, Yu Wang, Milind Tambe, Tonghan Wang

Chain-of-Translation Prompting (CoTR): A Novel Prompting Technique for Low Resource Languages (31 Dec 2024) [LRM]
Tejas Deshpande, Nidhi Kowtal, Raviraj Joshi

Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights (19 Sep 2024)
Mohamad Ballout, U. Krumnack, Gunther Heidemann, Kai-Uwe Kühnberger

Explanation Regularisation through the Lens of Attributions (23 Jul 2024)
Pedro Ferreira, Wilker Aziz, Ivan Titov

Evaluating Human Alignment and Model Faithfulness of LLM Rationale (28 Jun 2024)
Mohsen Fayyaz, Fan Yin, Jiao Sun, Nanyun Peng

ALMANACS: A Simulatability Benchmark for Language Model Explainability (20 Dec 2023)
Edmund Mills, Shiye Su, Stuart J. Russell, Scott Emmons

Passive learning of active causal strategies in agents and language models (25 May 2023)
Andrew Kyle Lampinen, Stephanie C. Y. Chan, Ishita Dasgupta, A. Nam, Jane X. Wang

Training Language Models with Language Feedback at Scale (28 Mar 2023) [ALM]
Jérémy Scheurer, Jon Ander Campos, Tomasz Korbak, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them (17 Oct 2022) [ALM, ELM, LRM, ReLM]
Mirac Suzgun, Nathan Scales, Nathanael Schärli, Sebastian Gehrmann, Yi Tay, ..., Aakanksha Chowdhery, Quoc V. Le, Ed H. Chi, Denny Zhou, Jason W. Wei

Learning to Reason With Relational Abstractions (06 Oct 2022) [ReLM, LRM]
A. Nam, Mengye Ren, Chelsea Finn, James L. McClelland

Mediators: Conversational Agents Explaining NLP Model Behavior (13 Jun 2022)
Nils Feldhus, A. Ravichandran, Sebastian Möller

Learning to Ignore Adversarial Attacks (23 May 2022)
Yiming Zhang, Yan Zhou, Samuel Carton, Chenhao Tan

Can language models learn from explanations in context? (05 Apr 2022) [LRM, ReLM]
Andrew Kyle Lampinen, Ishita Dasgupta, Stephanie C. Y. Chan, Kory Matthewson, Michael Henry Tessler, Antonia Creswell, James L. McClelland, Jane X. Wang, Felix Hill

What to Learn, and How: Toward Effective Learning from Rationales (30 Nov 2021)
Samuel Carton, Surya Kanoria, Chenhao Tan

Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs (26 Nov 2021) [KELM, LRM]
Peter Hase, Mona T. Diab, Asli Celikyilmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srini Iyer

Supervising Model Attention with Human Explanations for Robust Natural Language Inference (16 Apr 2021)
Joe Stacey, Yonatan Belinkov, Marek Rei

Local Interpretations for Explainable Natural Language Processing: A Survey (20 Mar 2021) [MILM]
Siwen Luo, Hamish Ivison, S. Han, Josiah Poon

Measuring Association Between Labels and Free-Text Rationales (24 Oct 2020)
Sarah Wiegreffe, Ana Marasović, Noah A. Smith

ExpBERT: Representation Engineering with Natural Language Explanations (05 May 2020)
Shikhar Murty, Pang Wei Koh, Percy Liang

Language Models as Knowledge Bases? (03 Sep 2019) [KELM, AI4MH]
Fabio Petroni, Tim Rocktäschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

e-SNLI: Natural Language Inference with Natural Language Explanations (04 Dec 2018) [LRM]
Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom

Towards A Rigorous Science of Interpretable Machine Learning (28 Feb 2017) [XAI, FaML]
Finale Doshi-Velez, Been Kim