Robust and Stable Black Box Explanations

International Conference on Machine Learning (ICML), 2020

12 November 2020

Himabindu Lakkaraju

Papers citing "Robust and Stable Black Box Explanations"

Axiomatic Explainer Globalness via Optimal Transport

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

558

13 Mar 2025

Interpretable Model Drift Detection

Pranoy Panda

Kancheti Sai Srinivas

V. Balasubramanian

Gaurav Sinha

370

09 Mar 2025

An Evaluation of Explanation Methods for Black-Box Detectors of Machine-Generated Text

304

26 Aug 2024

On the Robustness of Global Feature Effect Explanations

378

13 Jun 2024

Robust Explainable Recommendation

Sairamvinay Vijayaraghavan

Prasant Mohapatra

AAML

352

03 May 2024

T-Explainer: A Model-Agnostic Explainability Framework Based on Gradients

Claudio T. Silva

543

25 Apr 2024

Revealing Vulnerabilities of Neural Networks in Parameter Learning and Defense Against Explanation-Aware Backdoors

354

25 Mar 2024

X Hacking: The Threat of Misguided AutoML

556

16 Jan 2024

Advancing Ante-Hoc Explainable Models through Generative Adversarial Networks

Tanmay Garg

Deepika Vemuri

Vineeth N. Balasubramanian

GAN

262

09 Jan 2024

Rethinking Robustness of Model Attributions

AAAI Conference on Artificial Intelligence (AAAI), 2023

Sandesh Kamath

Sankalp Mittal

Amit Deshpande

Vineeth N. Balasubramanian

276

16 Dec 2023

How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?

Zachariah Carmichael

Walter J. Scheirer

FAtt

306

27 Oct 2023

Confident Feature Ranking

International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

Bitya Neuhof

Y. Benjamini

FAtt

373

28 Jul 2023

Explainable AI using expressive Boolean formulas

Machine Learning and Knowledge Extraction (MLKE), 2023

262

06 Jun 2023

Adversarial attacks and defenses in explainable artificial intelligence: A survey

Information Fusion (Inf. Fusion), 2023

Hubert Baniecki

P. Biecek

AAML

634

141

06 Jun 2023

Post Hoc Explanations of Language Models Can Improve Language Models

Neural Information Processing Systems (NeurIPS), 2023

Jiaqi Ma

Himabindu Lakkaraju

285

19 May 2023

Robust Explanation Constraints for Neural Networks

International Conference on Learning Representations (ICLR), 2022

264

16 Dec 2022

Understanding and Enhancing Robustness of Concept-based Models

AAAI Conference on Artificial Intelligence (AAAI), 2022

Aidong Zhang

278

29 Nov 2022

A.I. Robustness: a Human-Centered Perspective on Technological Challenges and Opportunities

ACM Computing Surveys (ACM CSUR), 2022

371

17 Oct 2022

Inferring Sensitive Attributes from Model Explanations

International Conference on Information and Knowledge Management (CIKM), 2022

Vasisht Duddu

A. Boutet

MIACV SILM

335

21 Aug 2022

A Query-Optimal Algorithm for Finding Counterfactuals

International Conference on Machine Learning (ICML), 2022

329

14 Jul 2022

Explaining the root causes of unit-level changes

226

26 Jun 2022

Analyzing Explainer Robustness via Probabilistic Lipschitzness of Prediction Functions

International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

Jennifer Dy

532

24 Jun 2022

An empirical study of the effect of background data size on the stability of SHapley Additive exPlanations (SHAP) for deep learning models

300

24 Apr 2022

Framework for Evaluating Faithfulness of Local Explanations

International Conference on Machine Learning (ICML), 2022

482

01 Feb 2022

Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning

Neural Information Processing Systems (NeurIPS), 2022

Amit Dhurandhar

Karthikeyan N. Ramamurthy

Kartik Ahuja

Vijay Arya

FAtt

356

28 Jan 2022

From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI

ACM Computing Surveys (ACM CSUR), 2022

810

640

20 Jan 2022

Provably efficient, succinct, and precise explanations

Neural Information Processing Systems (NeurIPS), 2021

324

01 Nov 2021

A Survey on the Robustness of Feature Importance and Counterfactual Explanations

337

30 Oct 2021

Making Corgis Important for Honeycomb Classification: Adversarial Attacks on Concept-based Explainability Tools

Davis Brown

Henry Kvinge

AAML

316

14 Oct 2021

Self-learn to Explain Siamese Networks Robustly

220

15 Sep 2021

A Framework for Learning Ante-hoc Explainable Models via Concepts

Computer Vision and Pattern Recognition (CVPR), 2021

292

25 Aug 2021

Perturbing Inputs for Fragile Interpretations in Deep Natural Language Processing

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2021

345

11 Aug 2021

Extending LIME for Business Process Automation

260

09 Aug 2021

Probing GNN Explainers: A Rigorous Theoretical and Empirical Analysis of GNN Explanation Methods

Chirag Agarwal

Marinka Zitnik

Himabindu Lakkaraju

292

16 Jun 2021

Taxonomy of Machine Learning Safety: A Survey and Primer

ACM Computing Surveys (CSUR), 2021

366

09 Jun 2021

On the Lack of Robust Interpretability of Neural Text Classifiers

Findings (Findings), 2021

158

08 Jun 2021

Evaluating Local Explanations using White-box Models

Amir Hossein Akhavan Rahnama

301

04 Jun 2021

Towards Robust and Reliable Algorithmic Recourse

Neural Information Processing Systems (NeurIPS), 2021

Sohini Upadhyay

Shalmali Joshi

Himabindu Lakkaraju

385

127

26 Feb 2021

Do Input Gradients Highlight Discriminative Features?

Neural Information Processing Systems (NeurIPS), 2021

Harshay Shah

Prateek Jain

Praneeth Netrapalli

AAML FAtt

411

25 Feb 2021

Attribution Mask: Filtering Out Irrelevant Features By Recursively Focusing Attention on Inputs of DNNs

Jaehwan Lee

Joon‐Hyuk Chang

TDI FAtt

248

15 Feb 2021

Connecting Interpretability and Robustness in Decision Trees through Separation

International Conference on Machine Learning (ICML), 2021

Michal Moshkovitz

Yao-Yuan Yang

Kamalika Chaudhuri

191

14 Feb 2021

Towards Robust Explanations for Deep Neural Networks

Pattern Recognition (Pattern Recognit.), 2020

Ann-Kathrin Dombrowski

Christopher J. Anders

K. Müller

Pan Kessel

FAtt

357

18 Dec 2020

Learning Models for Actionable Recourse

Neural Information Processing Systems (NeurIPS), 2020

Alexis Ross

Himabindu Lakkaraju

Osbert Bastani

FaML

314

12 Nov 2020

A Framework to Learn with Interpretation

Neural Information Processing Systems (NeurIPS), 2020

471

19 Oct 2020

Counterfactual Explanations for Machine Learning on Multivariate Time Series Data

301

25 Aug 2020

The best way to select features?

The Journal of Financial Data Science (JFDS), 2020

Xin Man

Ernest P. Chan

135

26 May 2020

Adversarial Discriminative Domain Adaptation

Computer Vision and Pattern Recognition (CVPR), 2017

1.5K

5,156

17 Feb 2017