Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie J. Cai, James Wexler, F. Viégas, Rory Sayres
arXiv:1711.11279 · 30 November 2017 · FAtt

Papers citing "Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)"
50 of 1,046 citing papers shown
Identifying Spurious Correlations using Counterfactual Alignment
  Joseph Paul Cohen, Louis Blankemeier, Akshay S. Chaudhari · CML · 01 Dec 2023
Generative models for visualising abstract social processes: Guiding streetview image synthesis of StyleGAN2 with indices of deprivation
  Aleksi Knuutila · GAN · 01 Dec 2023
Benchmarking and Enhancing Disentanglement in Concept-Residual Models
  Renos Zabounidis, Ini Oguntola, Konghao Zhao, Joseph Campbell, Simon Stepputtis, Katia P. Sycara · 30 Nov 2023
Survey on AI Ethics: A Socio-technical Perspective
  Dave Mbiazi, Meghana Bhange, Maryam Babaei, Ivaxi Sheth, Patrik Joslin Kenfack · 28 Nov 2023
Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations
  Maximilian Dreyer, Reduan Achtibat, Wojciech Samek, Sebastian Lapuschkin · 28 Nov 2023
Model-agnostic Body Part Relevance Assessment for Pedestrian Detection
  Maurice Günder, Sneha Banerjee, R. Sifa, Christian Bauckhage · FAtt · 27 Nov 2023
Having Second Thoughts? Let's hear it
  J. H. Lee, Sujith Vijayan · AAML · 26 Nov 2023
Concept Distillation: Leveraging Human-Centered Explanations for Model Improvement
  Avani Gupta, Saurabh Saini, P. J. Narayanan · 26 Nov 2023
Towards Interpretable Classification of Leukocytes based on Deep Learning
  S. Röhrl, Johannes Groll, M. Lengl, Simon Schumann, C. Klenk, D. Heim, Martin Knopp, Oliver Hayden, Klaus Diepold · 24 Nov 2023
Labeling Neural Representations with Inverse Recognition
  Kirill Bykov, Laura Kopf, Shinichi Nakajima, Marius Kloft, Marina M.-C. Höhne · BDL · 22 Nov 2023
Auxiliary Losses for Learning Generalizable Concept-based Models
  Ivaxi Sheth, Samira Ebrahimi Kahou · 18 Nov 2023
Representing visual classification as a linear combination of words
  Shobhit Agarwal, Yevgeniy R. Semenov, William Lotter · 18 Nov 2023
Identifying Linear Relational Concepts in Large Language Models
  David Chanin, Anthony Hunter, Oana-Maria Camburu · LLMSV, KELM · 15 Nov 2023
Interpreting Pretrained Language Models via Concept Bottlenecks
  Zhen Tan, Lu Cheng, Song Wang, Yuan Bo, Jundong Li, Huan Liu · LRM · 08 Nov 2023
The Linear Representation Hypothesis and the Geometry of Large Language Models
  Kiho Park, Yo Joong Choe, Victor Veitch · LLMSV, MILM · 07 Nov 2023
InterVLS: Interactive Model Understanding and Improvement with Vision-Language Surrogates
  Jinbin Huang, Wenbin He, Liangke Gou, Liu Ren, Chris Bryan · VLM · 06 Nov 2023
Feature Attribution Explanations for Spiking Neural Networks
  Elisa Nguyen, Meike Nauta, G. Englebienne, Christin Seifert · FAtt, AAML, LRM · 02 Nov 2023
Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions
  Luca Longo, Mario Brcic, Federico Cabitza, Jaesik Choi, Roberto Confalonieri, ..., Andrés Páez, Wojciech Samek, Johannes Schneider, Timo Speith, Simone Stumpf · 30 Oct 2023
This Looks Like Those: Illuminating Prototypical Concepts Using Multiple Visualizations
  Chiyu Ma, Brandon Zhao, Chaofan Chen, Cynthia Rudin · 28 Oct 2023
How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?
  Zachariah Carmichael, Walter J. Scheirer · FAtt · 27 Oct 2023
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
  Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman · 26 Oct 2023
This Reads Like That: Deep Learning for Interpretable Natural Language Processing
  Claudio Fanconi, Moritz Vandenhirtz, Severin Husmann, Julia E. Vogt · FAtt · 25 Oct 2023
Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving
  J. Echterhoff, An Yan, Kyungtae Han, Amr Abdelraouf, Rohit Gupta, Julian McAuley · 25 Oct 2023
Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number
  Sophie Hao, Tal Linzen · 23 Oct 2023
Preference Elicitation with Soft Attributes in Interactive Recommendation
  Erdem Biyik, Fan Yao, Yinlam Chow, Alex Haig, Chih-Wei Hsu, Mohammad Ghavamzadeh, Craig Boutilier · 22 Oct 2023
Getting aligned on representational alignment
  Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, ..., Thomas Unterthiner, Andrew Kyle Lampinen, Klaus-Robert Muller, M. Toneva, Thomas L. Griffiths · 18 Oct 2023
Removing Spurious Concepts from Neural Network Representations via Joint Subspace Estimation
  Floris Holstege, Bram Wouters, Noud van Giersbergen, C. Diks · 18 Oct 2023
From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks
  Jae Hee Lee, Sergio Lanza, Stefan Wermter · 18 Oct 2023
Explaining Deep Neural Networks for Bearing Fault Detection with Vibration Concepts
  Thomas Decker, Michael Lebacher, Volker Tresp · FAtt · 17 Oct 2023
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations
  Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin · LRM, MILM, ELM · 17 Oct 2023
Interpreting and Controlling Vision Foundation Models via Text Explanations
  Haozhe Chen, Junfeng Yang, Carl Vondrick, Chengzhi Mao · 16 Oct 2023
Automated Natural Language Explanation of Deep Visual Neurons with Large Models
  Chenxu Zhao, Wei Qian, Yucheng Shi, Mengdi Huai, Ninghao Liu · 16 Oct 2023
Transparent Anomaly Detection via Concept-based Explanations
  Laya Rafiee Sevyeri, Ivaxi Sheth, Farhood Farahnak, Samira Ebrahimi Kahou, S. Enger · 16 Oct 2023
The Thousand Faces of Explainable AI Along the Machine Learning Life Cycle: Industrial Reality and Current State of Research
  Thomas Decker, Ralf Gross, Alexander Koebler, Michael Lebacher, Ronald Schnitzer, Stefan H. Weber · 11 Oct 2023
SurroCBM: Concept Bottleneck Surrogate Models for Generative Post-hoc Explanation
  Bo Pan, Zhenke Liu, Yifei Zhang, Liang Zhao · 11 Oct 2023
NeuroInspect: Interpretable Neuron-based Debugging Framework through Class-conditional Visualizations
  Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee · AAML · 11 Oct 2023
Latent Diffusion Counterfactual Explanations
  Karim Farid, Simon Schrodi, Max Argus, Thomas Brox · DiffM · 10 Oct 2023
Deep Concept Removal
  Yegor Klochkov, Jean-François Ton, Ruocheng Guo, Yang Liu, Hang Li · 09 Oct 2023
Demystifying Embedding Spaces using Large Language Models
  Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier · 06 Oct 2023
Attributing Learned Concepts in Neural Networks to Training Data
  N. Konz, Charles Godfrey, Madelyn Shapiro, Jonathan Tu, Henry Kvinge, Davis Brown · TDI, FAtt · 04 Oct 2023
A Framework for Interpretability in Machine Learning for Medical Imaging
  Alan Q. Wang, Batuhan K. Karaman, Heejong Kim, Jacob Rosenthal, Rachit Saluja, Sean I. Young, M. Sabuncu · AI4CE · 02 Oct 2023
Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation
  Sidney Bender, Christopher J. Anders, Pattarawat Chormai, Heike Marxfeld, J. Herrmann, G. Montavon · CML · 02 Oct 2023
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
  Xuansheng Wu, Wenlin Yao, Jianshu Chen, Xiaoman Pan, Xiaoyang Wang, Ninghao Liu, Dong Yu · LRM · 30 Sep 2023
Development of a Deep Learning Method to Identify Acute Ischemic Stroke Lesions on Brain CT
  Alessandro Fontanella, Wenwen Li, Grant Mair, Antreas Antoniou, Eleanor Platt, Paul Armitage, Emanuele Trucco, Joanna M. Wardlaw, Amos Storkey · OOD · 29 Sep 2023
Learning to Receive Help: Intervention-Aware Concept Embedding Models
  M. Zarlenga, Katherine M. Collins, Krishnamurthy Dvijotham, Adrian Weller, Z. Shams, M. Jamnik · 29 Sep 2023
DeepRepViz: Identifying Confounders in Deep Learning Model Predictions
  R. Rane, JiHoon Kim, Arjun Umesha, Didem Stark, Marc-Andre Schulz, K. Ritter · BDL · 27 Sep 2023
Explaining Deep Face Algorithms through Visualization: A Survey
  Thrupthi Ann, S. M. I. C. V. Balasubramanian, M. Jawahar · CVBM · 26 Sep 2023
I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses
  Trong-Thang Pham, Jacob Brecheisen, Anh Nguyen, Hien Nguyen, Ngan Le · 24 Sep 2023
State2Explanation: Concept-Based Explanations to Benefit Agent Learning and User Understanding
  Devleena Das, Sonia Chernova, Been Kim · LRM, LLMAG · 21 Sep 2023
DiffusionWorldViewer: Exposing and Broadening the Worldview Reflected by Generative Text-to-Image Models
  Zoe De Simone, Angie Boggust, Arvindmani Satyanarayan, Ashia Wilson · 18 Sep 2023