Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals
arXiv:2310.00603 · 1 October 2023
Y. Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, Roi Reichart
Papers citing "Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals" (32 of 32 papers shown)
Can LLMs Explain Themselves Counterfactually?
Zahra Dehghanighobadi, Asja Fischer, Muhammad Bilal Zafar
LRM · 38 · 0 · 0 · 25 Feb 2025

Interpreting Language Reward Models via Contrastive Explanations
Junqi Jiang, Tom Bewley, Saumitra Mishra, Freddy Lecue, Manuela Veloso
74 · 0 · 0 · 25 Nov 2024

Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance
Omer Nahum, Nitay Calderon, Orgad Keller, Idan Szpektor, Roi Reichart
23 · 2 · 0 · 24 Oct 2024

Causality for Large Language Models
Anpeng Wu, Kun Kuang, Minqin Zhu, Yingrong Wang, Yujia Zheng, Kairong Han, B. Li, Guangyi Chen, Fei Wu, Kun Zhang
LRM · 46 · 7 · 0 · 20 Oct 2024

TAGExplainer: Narrating Graph Explanations for Text-Attributed Graph Learning Models
Bo Pan, Zhen Xiong, Guanchen Wu, Zheng Zhang, Yifei Zhang, Liang Zhao
FAtt · 36 · 1 · 0 · 20 Oct 2024

Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
25 · 0 · 0 · 18 Oct 2024

NL-Eye: Abductive NLI for Images
Mor Ventura, Michael Toker, Nitay Calderon, Zorik Gekhman, Yonatan Bitton, Roi Reichart
28 · 1 · 0 · 03 Oct 2024

Counterfactual Token Generation in Large Language Models
Ivi Chatzi, N. C. Benz, Eleni Straitouri, Stratis Tsirtsis, Manuel Gomez Rodriguez
LRM · 34 · 3 · 0 · 25 Sep 2024

Causal Inference with Large Language Model: A Survey
Jing Ma
CML · LRM · 91 · 8 · 0 · 15 Sep 2024

Enhancing adversarial robustness in Natural Language Inference using explanations
Alexandros Koulakos, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou
SILM · AAML · 35 · 0 · 0 · 11 Sep 2024

Using LLMs for Explaining Sets of Counterfactual Examples to Final Users
Arturo Fredes, Jordi Vitria
CML · LRM · 28 · 3 · 0 · 27 Aug 2024

On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon, Roi Reichart
36 · 10 · 0 · 27 Jul 2024

A Survey on Natural Language Counterfactual Generation
Yongjie Wang, Xiaoqi Qiu, Yu Yue, Xu Guo, Zhiwei Zeng, Yuhong Feng, Zhiqi Shen
34 · 5 · 0 · 04 Jul 2024

Distributional reasoning in LLMs: Parallel reasoning processes in multi-hop reasoning
Yuval Shalev, Amir Feder, Ariel Goldstein
LRM · 39 · 4 · 0 · 19 Jun 2024

Large Language Models for Constrained-Based Causal Discovery
Kai-Hendrik Cohrs, Gherardo Varando, Emiliano Díaz, Vasileios Sitokonstantinou, Gustau Camps-Valls
41 · 9 · 0 · 11 Jun 2024

Beyond Agreement: Diagnosing the Rationale Alignment of Automated Essay Scoring Methods based on Linguistically-informed Counterfactuals
Yupei Wang, Renfen Hu, Zhe Zhao
32 · 2 · 0 · 29 May 2024

Large Language Models and Causal Inference in Collaboration: A Survey
Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, ..., Haoliang Wang, Tong Yu, Julian McAuley, Wei Ai, Furong Huang
ELM · LRM · 77 · 5 · 0 · 14 Mar 2024

Large Scale Foundation Models for Intelligent Manufacturing Applications: A Survey
Haotian Zhang, S. D. Semujju, Zhicheng Wang, Xianwei Lv, Kang Xu, ..., Jing Wu, Zhuo Long, Wensheng Liang, Xiaoguang Ma, Ruiyan Zhuang
UQCV · AI4TS · AI4CE · 27 · 4 · 0 · 11 Dec 2023

On Measuring Faithfulness or Self-consistency of Natural Language Explanations
Letitia Parcalabescu, Anette Frank
LRM · 69 · 20 · 0 · 13 Nov 2023

T-COL: Generating Counterfactual Explanations for General User Preferences on Variable Machine Learning Systems
Yiming Li, Daling Wang, Wenfang Wu, Shi Feng, Yifei Zhang
CML · 40 · 1 · 0 · 28 Sep 2023

CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Sachdeva, Martin Tutek, Iryna Gurevych
OODD · 22 · 10 · 0 · 14 Sep 2023

Measuring the Robustness of NLP Models to Domain Shifts
Nitay Calderon, Naveh Porat, Eyal Ben-David, Alexander Chapanin, Zorik Gekhman, Nadav Oved, Vitaly Shalumov, Roi Reichart
16 · 6 · 0 · 31 May 2023

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor
31 · 17 · 0 · 03 May 2023

Causal Proxy Models for Concept-Based Model Explanations
Zhengxuan Wu, Karel D'Oosterlinck, Atticus Geiger, Amir Zur, Christopher Potts
MILM · 75 · 35 · 0 · 28 Sep 2022

Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
XAI · 106 · 107 · 0 · 22 Sep 2022

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM · ALM · 311 · 11,915 · 0 · 04 Mar 2022

Framework for Evaluating Faithfulness of Local Explanations
S. Dasgupta, Nave Frost, Michal Moshkovitz
FAtt · 111 · 61 · 0 · 01 Feb 2022

Rethinking Attention-Model Explainability through Faithfulness Violation Test
Y. Liu, Haoliang Li, Yangyang Guo, Chen Kong, Jing Li, Shiqi Wang
FAtt · 116 · 42 · 0 · 28 Jan 2022

Tailor: Generating and Perturbing Text with Semantic Controls
Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, Matt Gardner
136 · 77 · 0 · 15 Jul 2021

A Survey on Stance Detection for Mis- and Disinformation Identification
Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein
109 · 132 · 0 · 27 Feb 2021

Measuring Association Between Labels and Free-Text Rationales
Sarah Wiegreffe, Ana Marasović, Noah A. Smith
274 · 170 · 0 · 24 Oct 2020

What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
199 · 882 · 0 · 03 May 2018