Network Dissection: Quantifying Interpretability of Deep Visual Representations

19 April 2017

Antonio Torralba

Papers citing "Network Dissection: Quantifying Interpretability of Deep Visual Representations"

50 / 842 papers shown

Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework

390

18 Jun 2025

NERO: Explainable Out-of-Distribution Detection with Neuron-level Relevance

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

337

18 Jun 2025

Vision Transformers Don't Need Trained Registers

489

09 Jun 2025

InverseScope: Scalable Activation Inversion for Interpreting Large Language Models

Yifan Luo

Zhennan Zhou

Bin Dong

167

09 Jun 2025

CASE: Contrastive Activation for Saliency Estimation

366

08 Jun 2025

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

171

06 Jun 2025

FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models

296

27 May 2025

Relevance-driven Input Dropout: an Explanation-guided Regularization Technique

405

27 May 2025

FastCAV: Efficient Computation of Concept Activation Vectors for Explaining Deep Neural Networks

175

23 May 2025

Out-of-Distribution Detection via Channelwise Feature Aggregation in Neural Network-Based Receivers

372

21 May 2025

The Spotlight Resonance Method: Resolving the Alignment of Embedded Activations

George Bird

198

09 May 2025

ChannelExplorer: Exploring Class Separability Through Activation Channel Visualization

Md Rahat-uz-Zaman

Bei Wang

Paul Rosen

212

06 May 2025

Task Reconstruction and Extrapolation for π_0 using Text Latent

Quanyi Li

641

06 May 2025

Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability

Dip Roy

CML

06 May 2025

The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning

1.0K

30 Apr 2025

Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization

Emiliano Penaloza

Tianyue H. Zhan

Laurent Charlin

Mateo Espinosa Zarlenga

554

25 Apr 2025

Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts

1.1K

24 Apr 2025

Decoding Vision Transformers: the Diffusion Steering Lens

375

18 Apr 2025

Measuring the (Un)Faithfulness of Concept-Based Explanations

Shubham Kumar

Dwip Dalal

524

15 Apr 2025

Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning

Saif Punjwani

Larry Heck

LRM

235

14 Apr 2025

On Background Bias of Post-Hoc Concept Embeddings in Computer Vision DNNs

357

11 Apr 2025

From Colors to Classes: Emergence of Concepts in Vision Transformers

219

31 Mar 2025

Towards Human-Understandable Multi-Dimensional Concept Discovery

Computer Vision and Pattern Recognition (CVPR), 2025

258

24 Mar 2025

Automated Processing of eXplainable Artificial Intelligence Outputs in Deep Learning Models for Fault Diagnostics of Large Infrastructures

Engineering applications of artificial intelligence (EAAI), 2025

218

19 Mar 2025

Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation

370

16 Mar 2025

Learning Interpretable Logic Rules from Deep Vision Models

256

13 Mar 2025

Discovering Influential Neuron Path in Vision Transformers

International Conference on Learning Representations (ICLR), 2025

605

12 Mar 2025

Backdooring CLIP through Concept Confusion

314

12 Mar 2025

QPM: Discrete Optimization for Globally Interpretable Image Classification

International Conference on Learning Representations (ICLR), 2025

368

27 Feb 2025

Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts

Chaitanya Kapoor

Sudhanshu Srivastava

Meenakshi Khosla

376

26 Feb 2025

Model Lakes

International Conference on Extending Database Technology (EDBT), 2024

Koyena Pal

David Bau

Renée J. Miller

340

24 Feb 2025

LaVCa: LLM-assisted Visual Cortex Captioning

Takuya Matsuyama

Shinji Nishimoto

Yu Takagi

313

20 Feb 2025

Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

298

18 Feb 2025

TinyEmo: Scaling down Emotional Reasoning via Metric Projection

Cristian Gutierrez

LRM

523

17 Feb 2025

We Can't Understand AI Using our Existing Vocabulary

John Hewitt

Robert Geirhos

Been Kim

320

11 Feb 2025

Interpretable and Testable Vision Features via Sparse Autoencoders

397

10 Feb 2025

Deciphering Functions of Neurons in Vision-Language Models

863

10 Feb 2025

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment

Konstantinos G. Derpanis

344

06 Feb 2025

Compositional Concept-Based Neuron-Level Interpretability for Deep Reinforcement Learning

201

02 Feb 2025

Dimensions underlying the representational alignment of deep neural networks with humans

Nature Machine Intelligence (Nat. Mach. Intell.), 2024

388

28 Jan 2025

Faithful Counterfactual Visual Explanations (FCVE)

Knowledge-Based Systems (KBS), 2024

232

12 Jan 2025

Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

327

10 Jan 2025

Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts

311

18 Dec 2024

Concept Learning in the Wild: Towards Algorithmic Understanding of Neural Networks

238

15 Dec 2024

Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey

...

425

03 Dec 2024

GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers

451

23 Nov 2024

Towards Utilising a Range of Neural Activations for Comprehending Representational Associations

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

Laura O'Mahony

Nikola S. Nikolov

David JP O'Sullivan

445

15 Nov 2024

Local vs distributed representations: What is the right basis for interpretability?

260

06 Nov 2024

FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation

International Conference on Mobile Ad-hoc and Sensor Networks (ICMASN), 2024

385

04 Nov 2024

Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders

191

02 Nov 2024