Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations
arXiv:1703.03717 · 10 March 2017
Andrew Slavin Ross, Michael C. Hughes, Finale Doshi-Velez
Tags: FAtt
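For context while scanning the citation list below: the paper's core contribution is an explanation constraint added to the training loss, penalizing the model's input gradients wherever an annotation mask marks features as irrelevant, so the model is pushed to be right for the right reasons. What follows is a minimal PyTorch sketch of that objective; the helper name rrr_loss, the mask convention (1 = irrelevant feature), and the default weight lam are illustrative assumptions, not taken from this page.

```python
import torch
import torch.nn.functional as F

def rrr_loss(model, x, y, mask, lam=1000.0):
    """Sketch of the "right for the right reasons" objective:
    cross-entropy plus a penalty on input gradients of the summed
    log-probabilities over features the mask marks as irrelevant."""
    x = x.clone().detach().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)        # "right answer" term
    log_probs = F.log_softmax(logits, dim=-1)
    # Input gradient of the total log-probability; create_graph=True
    # lets the penalty itself be backpropagated during training.
    grads, = torch.autograd.grad(log_probs.sum(), x, create_graph=True)
    penalty = (mask * grads).pow(2).sum()  # "right reasons" term
    return ce + lam * penalty
```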
Papers citing "Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations" (50 of 75 papers shown)
- Large Language Models as Attribution Regularizers for Efficient Model Training. Davor Vukadin, Marin Šilić, Goran Delač. 27 Feb 2025.
- Diagnosing COVID-19 Severity from Chest X-Ray Images Using ViT and CNN Architectures. Luis Lara, Lucia Eve Berger, Rajesh Raju. 23 Feb 2025. [ViT]
- B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable. Shreyash Arya, Sukrut Rao, Moritz Böhle, Bernt Schiele. 28 Jan 2025.
- Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers. Lam Nguyen Tung, Steven Cho, Xiaoning Du, Neelofar Neelofar, Valerio Terragni, Stefano Ruberto, Aldeida Aleti. 30 Oct 2024.
- Problem Solving Through Human-AI Preference-Based Cooperation. Subhabrata Dutta, Timo Kaufmann, Goran Glavas, Ivan Habernal, Kristian Kersting, Frauke Kreuter, Mira Mezini, Iryna Gurevych, Eyke Hüllermeier, Hinrich Schuetze. 14 Aug 2024.
- Language-guided Detection and Mitigation of Unknown Dataset Bias. Zaiying Zhao, Soichiro Kumano, Toshihiko Yamasaki. 05 Jun 2024.
- AI, Meet Human: Learning Paradigms for Hybrid Decision Making Systems. Clara Punzi, Roberto Pellungrini, Mattia Setzu, F. Giannotti, D. Pedreschi. 09 Feb 2024.
- Identifying Spurious Correlations using Counterfactual Alignment. Joseph Paul Cohen, Louis Blankemeier, Akshay S. Chaudhari. 01 Dec 2023. [CML]
- Improving Interpretation Faithfulness for Vision Transformers. Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang. 29 Nov 2023.
- Explaining high-dimensional text classifiers. Odelia Melamed, Rich Caruana. 22 Nov 2023.
- SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training. Rui Xu, Wenkang Qin, Peixiang Huang, Hao Wang, Lin Luo. 09 Nov 2023. [FAtt, AAML]
- Interpretability-Aware Vision Transformer. Yao Qiang, Chengyin Li, Prashant Khanduri, D. Zhu. 14 Sep 2023. [ViT]
- FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods. Robin Hesse, Simone Schaub-Meyer, Stefan Roth. 11 Aug 2023. [AAML]
- Unlearning Spurious Correlations in Chest X-ray Classification. Misgina Tsighe Hagos, Kathleen M. Curran, Brian Mac Namee. 02 Aug 2023. [CML, OOD]
- Mitigating Bias: Enhancing Image Classification by Improving Model Explanations. Raha Ahmadi, Mohammad Javad Rajabi, Mohammad Khalooiem, Mohammad Sabokrou. 04 Jul 2023.
- Learning Differentiable Logic Programs for Abstract Visual Reasoning. Hikaru Shindo, Viktor Pfanschilling, D. Dhami, Kristian Kersting. 03 Jul 2023. [NAI]
- One Explanation Does Not Fit XIL. Felix Friedrich, David Steinmann, Kristian Kersting. 14 Apr 2023. [LRM]
- Are Data-driven Explanations Robust against Out-of-distribution Data? Tang Li, Fengchun Qiao, Mengmeng Ma, Xiangkai Peng. 29 Mar 2023. [OODD, OOD]
- Learning with Explanation Constraints. Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar. 25 Mar 2023. [FAtt]
- Towards Learning and Explaining Indirect Causal Effects in Neural Networks. Abbavaram Gowtham Reddy, Saketh Bachu, Harsh Nilesh Pathak, Ben Godfrey, V. Balasubramanian, V. Varshaneya, Satya Narayanan Kar. 24 Mar 2023. [CML]
- Towards Explaining Subjective Ground of Individuals on Social Media. Younghun Lee, Dan Goldwasser. 18 Nov 2022.
- XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models. Dong-Ho Lee, Akshen Kadakia, Brihi Joshi, Aaron Chan, Ziyi Liu, ..., Takashi Shibuya, Ryosuke Mitani, Toshiyuki Sekiya, Jay Pujara, Xiang Ren. 30 Oct 2022. [LRM]
- Revision Transformers: Instructing Language Models to Change their Values. Felix Friedrich, Wolfgang Stammer, P. Schramowski, Kristian Kersting. 19 Oct 2022. [KELM]
- Equivariant and Invariant Grounding for Video Question Answering. Yicong Li, Xiang Wang, Junbin Xiao, Tat-Seng Chua. 26 Jul 2022.
- RES: A Robust Framework for Guiding Visual Explanation. Yuyang Gao, Tong Sun, Guangji Bai, Siyi Gu, S. Hong, Liang Zhao. 27 Jun 2022. [FAtt, AAML, XAI]
- The Importance of Background Information for Out of Distribution Generalization. Jupinder Parmar, Khaled Kamal Saab, Brian Pogatchnik, D. Rubin, Christopher Ré. 17 Jun 2022. [OOD]
- Optimizing Relevance Maps of Vision Transformers Improves Robustness. Hila Chefer, Idan Schwartz, Lior Wolf. 02 Jun 2022. [ViT]
- Learning to Ignore Adversarial Attacks. Yiming Zhang, Yan Zhou, Samuel Carton, Chenhao Tan. 23 May 2022.
- Perspectives on Incorporating Expert Feedback into Model Updates. Valerie Chen, Umang Bhatt, Hoda Heidari, Adrian Weller, Ameet Talwalkar. 13 May 2022.
- Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection. Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr. 23 Mar 2022.
- Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment. Yuyang Gao, Tong Sun, Liang Zhao, Sungsoo Ray Hong. 06 Feb 2022. [HAI]
- Right for the Right Latent Factors: Debiasing Generative Models via Disentanglement. Xiaoting Shao, Karl Stelzner, Kristian Kersting. 01 Feb 2022. [CML, DRL]
- Debiased-CAM to mitigate systematic error with faithful visual explanations of machine learning. Wencan Zhang, Mariella Dimiccoli, Brian Y. Lim. 30 Jan 2022. [FAtt]
- Making a (Counterfactual) Difference One Rationale at a Time. Michael J. Plyler, Michal Green, Min Chi. 13 Jan 2022.
- Towards Relatable Explainable AI with the Perceptual Process. Wencan Zhang, Brian Y. Lim. 28 Dec 2021. [AAML, XAI]
- What to Learn, and How: Toward Effective Learning from Rationales. Samuel Carton, Surya Kanoria, Chenhao Tan. 30 Nov 2021.
- Improving Deep Learning Interpretability by Saliency Guided Training. Aya Abdelsalam Ismail, H. C. Bravo, S. Feizi. 29 Nov 2021. [FAtt]
- Matching Learned Causal Effects of Neural Networks with Domain Priors. Sai Srinivas Kancheti, Abbavaram Gowtham Reddy, V. Balasubramanian, Amit Sharma. 24 Nov 2021. [CML]
- Toward Learning Human-aligned Cross-domain Robust Models by Countering Misaligned Features. Haohan Wang, Zeyi Huang, Hanlin Zhang, Yong Jae Lee, Eric P. Xing. 05 Nov 2021. [OOD]
- Modeling Techniques for Machine Learning Fairness: A Survey. Mingyang Wan, Daochen Zha, Ninghao Liu, Na Zou. 04 Nov 2021. [SyDa, FaML]
- Equality of opportunity in travel behavior prediction with deep neural networks and discrete choice models. Yunhan Zheng, Shenhao Wang, Jinhua Zhao. 25 Sep 2021. [HAI]
- Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience. G. Chrysostomou, Nikolaos Aletras. 31 Aug 2021.
- Improving the trustworthiness of image classification models by utilizing bounding-box annotations. K. Dharma, Chicheng Zhang. 15 Aug 2021.
- Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization. Damien Teney, Ehsan Abbasnejad, Simon Lucey, A. Hengel. 12 May 2021.
- Shapley Explanation Networks. Rui Wang, Xiaoqian Wang, David I. Inouye. 06 Apr 2021. [TDI, FAtt]
- Efficient Explanations from Empirical Explainers. Robert Schwarzenberg, Nils Feldhus, Sebastian Möller. 29 Mar 2021. [FAtt]
- Large Pre-trained Language Models Contain Human-like Biases of What is Right and Wrong to Do. P. Schramowski, Cigdem Turan, Nico Andersen, Constantin Rothkopf, Kristian Kersting. 08 Mar 2021.
- EnD: Entangling and Disentangling deep representations for bias correction. Enzo Tartaglione, C. Barbano, Marco Grangetto. 02 Mar 2021.
- Contrastive Explanations for Model Interpretability. Alon Jacovi, Swabha Swayamdipta, Shauli Ravfogel, Yanai Elazar, Yejin Choi, Yoav Goldberg. 02 Mar 2021.
- Gifsplanation via Latent Shift: A Simple Autoencoder Approach to Counterfactual Generation for Chest X-rays. Joseph Paul Cohen, Rupert Brooks, Sovann En, Evan Zucker, Anuj Pareek, M. Lungren, Akshay S. Chaudhari. 18 Feb 2021. [FAtt, MedIm]