1704.05796
Network Dissection: Quantifying Interpretability of Deep Visual Representations
19 April 2017
David Bau
Bolei Zhou
A. Khosla
A. Oliva
Antonio Torralba
MILM
FAtt
Papers citing
"Network Dissection: Quantifying Interpretability of Deep Visual Representations"
50 / 842 papers shown
Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
Laura Kopf
Nils Feldhus
Kirill Bykov
P. Bommer
Anna Hedström
Marina M.-C. Höhne
Oliver Eberle
390
4
0
18 Jun 2025
NERO: Explainable Out-of-Distribution Detection with Neuron-level Relevance
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Anju Chhetri
Jari Korhonen
P. Gyawali
Binod Bhattarai
OODD
337
1
0
18 Jun 2025
Vision Transformers Don't Need Trained Registers
Nick Jiang
Amil Dravid
Alexei A. Efros
Yossi Gandelsman
489
11
0
09 Jun 2025
InverseScope: Scalable Activation Inversion for Interpreting Large Language Models
Yifan Luo
Zhennan Zhou
Bin Dong
167
0
0
09 Jun 2025
CASE: Contrastive Activation for Saliency Estimation
Dane Williamson
Yangfeng Ji
Matthew B. Dwyer
FAtt
AAML
366
0
0
08 Jun 2025
Evaluating Neuron Explanations: A Unified Framework with Sanity Checks
Tuomas P. Oikarinen
Ge Yan
Tsui-Wei Weng
FAtt
XAI
171
7
0
06 Jun 2025
FeatInv: Spatially resolved mapping from feature space to input space using conditional diffusion models
Nils Neukirch
Johanna Vielhaben
Nils Strodthoff
DiffM
296
1
0
27 May 2025
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique
Shreyas Gururaj
Lars Grüne
Wojciech Samek
Sebastian Lapuschkin
Leander Weber
405
1
0
27 May 2025
FastCAV: Efficient Computation of Concept Activation Vectors for Explaining Deep Neural Networks
Laines Schmalwasser
Niklas Penzel
Joachim Denzler
Julia Niebling
175
3
0
23 May 2025
Out-of-Distribution Detection via Channelwise Feature Aggregation in Neural Network-Based Receivers
Marko Tuononen
Duy Vu
Dani Korpi
Vesa Starck
Ville Hautamäki
372
1
0
21 May 2025
The Spotlight Resonance Method: Resolving the Alignment of Embedded Activations
George Bird
198
2
0
09 May 2025
ChannelExplorer: Exploring Class Separability Through Activation Channel Visualization
Md Rahat-uz-Zaman
Bei Wang
Paul Rosen
212
0
0
06 May 2025
Task Reconstruction and Extrapolation for π_0 using Text Latent
Quanyi Li
641
2
0
06 May 2025
Causal Intervention Framework for Variational Auto Encoder Mechanistic Interpretability
Dip Roy
CML
79
0
0
06 May 2025
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
1.0K
0
0
30 Apr 2025
Addressing Concept Mislabeling in Concept Bottleneck Models Through Preference Optimization
Emiliano Penaloza
Tianyue H. Zhan
Laurent Charlin
Mateo Espinosa Zarlenga
554
2
0
25 Apr 2025
Avoiding Leakage Poisoning: Concept Interventions Under Distribution Shifts
M. Zarlenga
Gabriele Dominici
Pietro Barbiero
Z. Shams
M. Jamnik
KELM
1.1K
3
0
24 Apr 2025
Decoding Vision Transformers: the Diffusion Steering Lens
Ryota Takatsuki
Sonia Joseph
Ippei Fujisawa
Ryota Kanai
DiffM
375
0
0
18 Apr 2025
Measuring the (Un)Faithfulness of Concept-Based Explanations
Shubham Kumar
Dwip Dalal
524
0
0
15 Apr 2025
Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
Saif Punjwani
Larry Heck
LRM
235
0
0
14 Apr 2025
On Background Bias of Post-Hoc Concept Embeddings in Computer Vision DNNs
Gesina Schwalbe
Georgii Mikriukov
Edgar Heinert
Stavros Gerolymatos
Mert Keser
Alois Knoll
Matthias Rottmann
Annika Mütze
357
0
0
11 Apr 2025
From Colors to Classes: Emergence of Concepts in Vision Transformers
Teresa Dorszewski
Lenka Tětková
Robert Jenssen
Lars Kai Hansen
Kristoffer Wickstrøm
219
11
0
31 Mar 2025
Towards Human-Understandable Multi-Dimensional Concept Discovery
Computer Vision and Pattern Recognition (CVPR), 2025
Arne Grobrugge
Niklas Kühl
G. Satzger
Philipp Spitzer
258
2
0
24 Mar 2025
Automated Processing of eXplainable Artificial Intelligence Outputs in Deep Learning Models for Fault Diagnostics of Large Infrastructures
Engineering applications of artificial intelligence (EAAI), 2025
Giovanni Floreale
Piero Baraldi
Enrico Zio
Olga Fink
218
2
0
19 Mar 2025
Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation
Edgar Heinert
Thomas Gottwald
Annika Mütze
Matthias Rottmann
370
1
0
16 Mar 2025
Learning Interpretable Logic Rules from Deep Vision Models
Chuqin Geng
Yuhe Jiang
Ziyu Zhao
Haolin Ye
Zhaoyue Wang
X. Si
NAI
FAtt
VLM
256
1
0
13 Mar 2025
Discovering Influential Neuron Path in Vision Transformers
International Conference on Learning Representations (ICLR), 2025
Yifan Wang
Yifei Liu
Yingdong Shi
Chong Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
605
3
0
12 Mar 2025
Backdooring CLIP through Concept Confusion
Lijie Hu
Junchi Liao
Weimin Lyu
Shaopeng Fu
Tianhao Huang
Shu Yang
Guimin Hu
Di Wang
AAML
314
1
0
12 Mar 2025
QPM: Discrete Optimization for Globally Interpretable Image Classification
International Conference on Learning Representations (ICLR), 2025
Thomas Norrenbrock
Timo Kaiser
Sovan Biswas
R. Manuvinakurike
Bodo Rosenhahn
368
2
0
27 Feb 2025
Bridging Critical Gaps in Convergent Learning: How Representational Alignment Evolves Across Layers, Training, and Distribution Shifts
Chaitanya Kapoor
Sudhanshu Srivastava
Meenakshi Khosla
376
1
0
26 Feb 2025
Model Lakes
International Conference on Extending Database Technology (EDBT), 2024
Koyena Pal
David Bau
Renée J. Miller
340
2
0
24 Feb 2025
LaVCa: LLM-assisted Visual Cortex Captioning
Takuya Matsuyama
Shinji Nishimoto
Yu Takagi
313
3
0
20 Feb 2025
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
Thomas Fel
Ekdeep Singh Lubana
Jacob S. Prince
M. Kowal
Victor Boutin
Isabel Papadimitriou
Binxu Wang
Martin Wattenberg
Demba Ba
Talia Konkle
298
28
0
18 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
523
0
0
17 Feb 2025
We Can't Understand AI Using our Existing Vocabulary
John Hewitt
Robert Geirhos
Been Kim
320
14
0
11 Feb 2025
Interpretable and Testable Vision Features via Sparse Autoencoders
Samuel Stevens
Wei-Lun Chao
T. Berger-Wolf
Yu-Chuan Su
VLM
397
17
0
10 Feb 2025
Deciphering Functions of Neurons in Vision-Language Models
Jiaqi Xu
Cuiling Lan
Xuejin Chen
VLM
863
0
0
10 Feb 2025
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
344
24
0
06 Feb 2025
Compositional Concept-Based Neuron-Level Interpretability for Deep Reinforcement Learning
Zeyu Jiang
Hai Huang
Xingquan Zuo
OffRL
201
0
0
02 Feb 2025
Dimensions underlying the representational alignment of deep neural networks with humans
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
F. Mahner
Lukas Muttenthaler
Umut Güçlü
M. Hebart
388
24
0
28 Jan 2025
Faithful Counterfactual Visual Explanations (FCVE)
Knowledge-Based Systems (KBS), 2024
Bismillah Khan
Syed Ali Tariq
Tehseen Zia
Muhammad Ahsan
David Windridge
232
1
0
12 Jan 2025
Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Marko Tuononen
Dani Korpi
Ville Hautamäki
FAtt
327
2
0
10 Jan 2025
Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts
Jihye Choi
Jayaram Raghuram
Shouqing Yang
Somesh Jha
311
8
0
18 Dec 2024
Concept Learning in the Wild: Towards Algorithmic Understanding of Neural Networks
Elad Shoham
Hadar Cohen
Khalil Wattad
Havana Rika
Dan Vilenchik
238
1
0
15 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
425
48
0
03 Dec 2024
GIFT: A Framework for Global Interpretable Faithful Textual Explanations of Vision Classifiers
Éloi Zablocki
Valentin Gerard
Amaia Cardiel
Eric Gaussier
Matthieu Cord
Eduardo Valle
451
0
0
23 Nov 2024
Towards Utilising a Range of Neural Activations for Comprehending Representational Associations
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024
Laura O'Mahony
Nikola S. Nikolov
David JP O'Sullivan
445
2
0
15 Nov 2024
Local vs distributed representations: What is the right basis for interpretability?
Julien Colin
L. Goetschalckx
Thomas Fel
Victor Boutin
Jay Gopal
Thomas Serre
Nuria Oliver
HAI
260
4
0
06 Nov 2024
FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation
International Conference on Mobile Ad-hoc and Sensor Networks (ICMASN), 2024
Ziwei Zhan
Wenkuan Zhao
Yuanqing Li
Weijie Liu
Xiaoxi Zhang
Chee Wei Tan
Chuan Wu
Deke Guo
Xu Chen
MoE
385
8
0
04 Nov 2024
Enhancing Neural Network Interpretability with Feature-Aligned Sparse Autoencoders
Luke Marks
Alasdair Paren
David M. Krueger
Fazl Barez
AAML
191
16
0
02 Nov 2024