Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

30 November 2017

Justin Gilmer

Papers citing "Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)"

50 / 1,045 papers shown

Title
Mask-Free Neuron Concept Annotation for Interpreting Neural Networks in Medical Domain Hyeon Bae Kim Yong Hyun Ahn Seong Tae Kim 40 1 0 16 Jul 2024
Understanding the Dependence of Perception Model Competency on Regions in an Image Sara Pohland Claire Tomlin 19 1 0 15 Jul 2024
Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations David Nader-Palacio Daniel Rodríguez-Cárdenas Alejandro Velasco Dipin Khati Kevin Moran Denys Poshyvanyk 50 6 0 12 Jul 2024
Constructing Concept-based Models to Mitigate Spurious Correlations with Minimal Human Effort Jeeyung Kim Ze Wang Qiang Qiu 38 1 0 12 Jul 2024
Understanding Visual Feature Reliance through the Lens of Complexity Thomas Fel Louis Bethune Andrew Kyle Lampinen Thomas Serre Katherine Hermann FAtt CoGe 30 6 0 08 Jul 2024
Identifying the Source of Generation for Large Language Models Bumjin Park Jaesik Choi 29 0 0 05 Jul 2024
Crafting Large Language Models for Enhanced Interpretability Chung-En Sun Tuomas P. Oikarinen Tsui-Wei Weng 35 6 0 05 Jul 2024
Concept Bottleneck Models Without Predefined Concepts Simon Schrodi Julian Schur Max Argus Thomas Brox 35 9 0 04 Jul 2024
DocXplain: A Novel Model-Agnostic Explainability Method for Document Image Classification S. Saifullah S. Agne Andreas Dengel Sheraz Ahmed 29 0 0 04 Jul 2024
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models Jayneel Parekh Quentin Bouniot Pavlo Mozharovskyi A. Newson Florence d'Alché-Buc SSL 61 1 0 01 Jul 2024
Explaining Chest X-ray Pathology Models using Textual Concepts Vijay Sadashivaiah M. Kalra P. Yan James A. Hendler 19 0 0 30 Jun 2024
FI-CBL: A Probabilistic Method for Concept-Based Learning with Expert Rules Lev V. Utkin A. Konstantinov Stanislav R. Kirpichenko 36 0 0 28 Jun 2024
Stochastic Concept Bottleneck Models Moritz Vandenhirtz Sonia Laguna Ricards Marcinkevics Julia E. Vogt 43 9 0 27 Jun 2024
Evidential Concept Embedding Models: Towards Reliable Concept Explanations for Skin Disease Diagnosis Yibo Gao Zheyao Gao Xin Gao Yuanye Liu Bomin Wang Xiahai Zhuang 28 1 0 27 Jun 2024
Towards Compositionality in Concept Learning Adam Stein Aaditya Naik Yinjun Wu Mayur Naik Eric Wong CoGe 39 2 0 26 Jun 2024
InFiConD: Interactive No-code Fine-tuning with Concept-based Knowledge Distillation Jinbin Huang Wenbin He Liang Gou Liu Ren Chris Bryan 47 0 0 25 Jun 2024
Large Language Models are Interpretable Learners Ruochen Wang Si Si Felix X. Yu Dorothea Wiesmann Cho-Jui Hsieh Inderjit Dhillon 24 3 0 25 Jun 2024
LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models Mengdan Zhu Raasikh Kanjiani Jiahui Lu Andrew Choi Qirui Ye Liang Zhao DiffM 39 1 0 21 Jun 2024
This Looks Better than That: Better Interpretable Models with ProtoPNeXt Frank Willard Luke Moffett Emmanuel Mokel Jon Donnelly Stark Guo Julia Yang Giyoung Kim Alina Jade Barnett Cynthia Rudin 34 4 0 20 Jun 2024
Self-supervised Interpretable Concept-based Models for Text Classification Francesco De Santis Philippe Bich Gabriele Ciravegna Pietro Barbiero Danilo Giordano Tania Cerquitelli 31 0 0 20 Jun 2024
Reasoning with trees: interpreting CNNs using hierarchies Caroline Mazini Rodrigues Nicolas Boutry Laurent Najman 18 0 0 19 Jun 2024
Investigating the Role of Explainability and AI Literacy in User Compliance Niklas Kühl Christian Meske Maximilian Nitsche Jodie Lobana 26 4 0 18 Jun 2024
Probabilistic Conceptual Explainers: Trustworthy Conceptual Explanations for Vision Foundation Models Hengyi Wang Shiwei Tan Hao Wang BDL 42 6 0 18 Jun 2024
A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning Lijie Hu Liang Liu Shu Yang Xin Chen Hongru Xiao Mengdi Li Pan Zhou Muhammad Asif Ali Di Wang LRM 37 5 0 18 Jun 2024
GECOBench: A Gender-Controlled Text Dataset and Benchmark for Quantifying Biases in Explanations Rick Wilming Artur Dox Hjalmar Schulz Marta Oliveira Benedict Clark Stefan Haufe 35 2 0 17 Jun 2024
Concept-skill Transferability-based Data Selection for Large Vision-Language Models Jaewoo Lee Boyang Li Sung Ju Hwang VLM 37 8 0 16 Jun 2024
Challenges in explaining deep learning models for data with biological variation Lenka Tětková E. Dreier Robin Malm Lars Kai Hansen AAML 38 1 0 14 Jun 2024
Talking Heads: Understanding Inter-layer Communication in Transformer Language Models Jack Merullo Carsten Eickhoff Ellie Pavlick 58 13 0 13 Jun 2024
ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery Kam Woh Ng Xiatian Zhu Yi-Zhe Song Tao Xiang 37 2 0 12 Jun 2024
A Concept-Based Explainability Framework for Large Multimodal Models Jayneel Parekh Pegah Khayatan Mustafa Shukor A. Newson Matthieu Cord 34 16 0 12 Jun 2024
Designing a Dashboard for Transparency and Control of Conversational AI Yida Chen Aoyu Wu Trevor DePodesta Catherine Yeh Kenneth Li ... Jan Riecke Shivam Raval Olivia Seow Martin Wattenberg Fernanda Viégas 44 16 0 12 Jun 2024
Understanding Inhibition Through Maximally Tense Images Chris Hamblin Srijani Saha Talia Konkle George Alvarez FAtt 32 0 0 08 Jun 2024
Attri-Net: A Globally and Locally Inherently Interpretable Model for Multi-Label Classification Using Class-Specific Counterfactuals Susu Sun S. Woerner Andreas Maier Lisa M. Koch Christian F. Baumgartner FAtt 25 1 0 08 Jun 2024
Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents Yoann Poupart 31 0 0 06 Jun 2024
Position: An Inner Interpretability Framework for AI Inspired by Lessons from Cognitive Neuroscience Martina G. Vilas Federico Adolfi David Poeppel Gemma Roig 42 5 0 03 Jun 2024
VOICE: Variance of Induced Contrastive Explanations to quantify Uncertainty in Neural Network Interpretability M. Prabhushankar Ghassan AlRegib FAtt UQCV 29 2 0 01 Jun 2024
Searching for internal symbols underlying deep learning J. H. Lee Sujith Vijayan AI4CE 29 0 0 31 May 2024
I Bet You Did Not Mean That: Testing Semantic Importance via Betting Jacopo Teneggi Jeremias Sulam FAtt 28 1 0 29 May 2024
Understanding Inter-Concept Relationships in Concept-Based Models Naveen Raman M. Zarlenga M. Jamnik 27 4 0 28 May 2024
Interpretable Prognostics with Concept Bottleneck Models Florent Forest Katharina Rombach Olga Fink 25 0 0 27 May 2024
Locally Testing Model Detections for Semantic Global Concepts Franz Motzkus Georgii Mikriukov Christian Hellert Ute Schmid 30 2 0 27 May 2024
Exploring the LLM Journey from Cognition to Expression with Linear Representations Yuzi Yan J. Li Yipin Zhang Dong Yan 41 1 0 27 May 2024
Concept-based Explainable Malignancy Scoring on Pulmonary Nodules in CT Images Rinat I. Dumaev S A Molodyakov Lev V. Utkin 21 0 0 24 May 2024
Controllable Continual Test-Time Adaptation Ziqi Shi Fan Lyu Ye Liu Fanhua Shang Fuyuan Hu Wei Feng Zhang Zhang Liang Wang TTA 60 2 0 23 May 2024
Automatically Identifying Local and Global Circuits with Linear Computation Graphs Xuyang Ge Fukang Zhu Wentao Shu Junxuan Wang Zhengfu He Xipeng Qiu 22 8 0 22 May 2024
Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model Mounes Zaval Sedat Ozer LRM 20 0 0 20 May 2024
Towards Gradient-based Time-Series Explanations through a SpatioTemporal Attention Network Min Hun Lee AI4TS ViT FAtt 24 3 0 18 May 2024
Contestable AI needs Computational Argumentation Francesco Leofante Hamed Ayoobi Adam Dejl Gabriel Freedman Deniz Gorur ... Anna Rapberger Fabrizio Russo Xiang Yin Dekai Zhang Francesca Toni 30 3 0 17 May 2024
Deep Learning in Earthquake Engineering: A Comprehensive Review Yazhou Xie AI4CE 27 5 0 15 May 2024
Error-margin Analysis for Hidden Neuron Activation Labels Abhilekha Dalal R. Rayan Pascal Hitzler FAtt 31 1 0 14 May 2024