
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

International Conference on Learning Representations (ICLR), 2022

7 February 2022

Christopher J. Anders

Thomas Wiegand

Wojciech Samek

Sebastian Lapuschkin


Papers citing "Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence"


Probing the Probes: Methods and Metrics for Concept Alignment

247

06 Nov 2025

Mitigating Clever Hans Strategies in Image Classifiers through Generating Counterexamples

276

20 Oct 2025

TDHook: A Lightweight Framework for Interpretability

Yoann Poupart

AI4CE

199

29 Sep 2025

Concept activation vectors: a unifying view and adversarial attacks

139

26 Sep 2025

In-hoc Concept Representations to Regularise Deep Learning in Medical Imaging

163

19 Aug 2025

From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance

279

26 May 2025

FastCAV: Efficient Computation of Concept Activation Vectors for Explaining Deep Neural Networks

225

23 May 2025

Steering CLIP's vision transformer with sparse autoencoders

357

11 Apr 2025

Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations

416

07 Mar 2025

Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?

Stefan Wermter

378

20 Sep 2024

Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression

274

15 Apr 2024

Manipulating Feature Visualizations with Gradient Slingshots

503

11 Jan 2024

Emergent Linear Representations in World Models of Self-Supervised Sequence Models

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023

383

299

02 Sep 2023

From Hope to Safety: Unlearning Biases of Deep Models via Gradient Penalization in Latent Space

AAAI Conference on Artificial Intelligence (AAAI), 2023

Maximilian Dreyer

Frederik Pahde

Christopher J. Anders

Wojciech Samek

Sebastian Lapuschkin

AI4CE

302

18 Aug 2023

FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods

IEEE International Conference on Computer Vision (ICCV), 2023

332

11 Aug 2023

Reveal to Revise: An Explainable AI Life Cycle for Iterative Bias Correction of Deep Models

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2023

221

22 Mar 2023

Concept Algebra for (Score-Based) Text-Controlled Generative Models

Neural Information Processing Systems (NeurIPS), 2023

721

07 Feb 2023

Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees

Johanna Vielhaben

Stefan Blücher

Nils Strodthoff

276

27 Jan 2023

CRAFT: Concept Recursive Activation FacTorization for Explainability

Computer Vision and Pattern Recognition (CVPR), 2022

438

189

17 Nov 2022

Concept Activation Regions: A Generalized Framework For Concept-Based Explanations

Neural Information Processing Systems (NeurIPS), 2022

Jonathan Crabbé

M. Schaar

383

22 Sep 2022

Toy Models of Superposition

...

1.9K

703

21 Sep 2022

From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation

Nature Machine Intelligence (Nat. Mach. Intell.), 2022

322

213

07 Jun 2022

Post-hoc Concept Bottleneck Models

International Conference on Learning Representations (ICLR), 2022

Mert Yuksekgonul

Maggie Wang

James Zou

525

286

31 May 2022

Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement

Information Fusion (Inf. Fusion), 2022

303

133

15 Mar 2022

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

Konpat Preechakul

Nattanat Chatthee

Suttisak Wizadwongsa

Supasorn Suwajanakorn

SyDa DiffM

548

573

30 Nov 2021

Acquisition of Chess Knowledge in AlphaZero

536

197

17 Nov 2021

ResNet strikes back: An improved training procedure in timm

692

596

01 Oct 2021

Software for Dataset-wide XAI: From Local Explanations to Global Insights with Zennit, CoRelAy, and ViRelAy

Christopher J. Anders

289

24 Jun 2021

ImageNet-21K Pretraining for the Masses

994

907

22 Apr 2021

Robust Semantic Interpretability: Revisiting Concept Activation Vectors

163

06 Apr 2021

EfficientNetV2: Smaller Models and Faster Training

International Conference on Machine Learning (ICML), 2021

Mingxing Tan

Quoc V. Le

EgoV

1.7K

4,136

01 Apr 2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy

...

1.6K

60,663

22 Oct 2020

Understanding the Role of Individual Units in a Deep Neural Network

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2020

Jun-Yan Zhu

Antonio Torralba

436

514

10 Sep 2020

Rethinking Channel Dimensions for Efficient Model Design

325

108

02 Jul 2020

Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors

Benjamin I. P. Rubinstein

FAtt

500

140

27 Jun 2020

Concept Whitening for Interpretable Image Recognition

Nature Machine Intelligence (NMI), 2020

Zhi Chen

Yijie Bei

Cynthia Rudin

FAtt

716

363

05 Feb 2020

Towards Best Practice in Explaining Neural Network Decisions with LRP

IEEE International Joint Conference on Neural Networks (IJCNN), 2019

483

171

22 Oct 2019

BCN20000: Dermoscopic Lesions in the Wild

Scientific Data (Sci Data), 2019

...

428

609

06 Aug 2019

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

International Conference on Machine Learning (ICML), 2019

Mingxing Tan

Quoc V. Le

3DV MedIm

899

23,288

28 May 2019

Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)

Justin Gilmer

755

2,241

30 Nov 2017

Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC)

...

Konstantinos Liopyris

N. Mishra

Harald Kittler

Allan Halpern

805

2,574

13 Oct 2017

A Unified Approach to Interpreting Model Predictions

Scott M. Lundberg

Su-In Lee

FAtt

5.2K

32,979

22 May 2017

Learning how to explain neural networks: PatternNet and PatternAttribution

Pieter-Jan Kindermans

362

368

16 May 2017

Learning to Generate Reviews and Discovering Sentiment

Alec Radford

Rafal Jozefowicz

Ilya Sutskever

421

542

05 Apr 2017

Aggregated Residual Transformations for Deep Neural Networks

Piotr Dollár

1.3K

11,529

16 Nov 2016

Understanding intermediate layers using linear classifier probes

International Conference on Learning Representations (ICLR), 2016

Guillaume Alain

Yoshua Bengio

FAtt

881

1,313

05 Oct 2016

Deep Residual Learning for Image Recognition

4.2K

225,080

10 Dec 2015

Deep Learning Face Attributes in the Wild

IEEE International Conference on Computer Vision (ICCV), 2014

1.7K

9,468

28 Nov 2014

Very Deep Convolutional Networks for Large-Scale Image Recognition

International Conference on Learning Representations (ICLR), 2014

Karen Simonyan

Andrew Zisserman

FAtt MDE

4.0K

110,590

04 Sep 2014