Understanding the Role of Individual Units in a Deep Neural Network

Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2020
10 September 2020
David Bau
Jun-Yan Zhu
Hendrik Strobelt
Àgata Lapedriza
Bolei Zhou
Antonio Torralba
    GAN

Papers citing "Understanding the Role of Individual Units in a Deep Neural Network"

50 / 233 papers shown
Mechanistic Finetuning of Vision-Language-Action Models via Few-Shot Demonstrations
Chancharik Mitra
Yusen Luo
Raj Saravanan
Dantong Niu
Anirudh Pai
Jesse Thomason
Trevor Darrell
Abrar Anwar
Deva Ramanan
Roei Herzig
52
0
0
27 Nov 2025
Guaranteed Optimal Compositional Explanations for Neurons
Biagio La Rosa
Leilani H. Gilpin
72
0
0
25 Nov 2025
Open Vocabulary Compositional Explanations for Neuron Alignment
Biagio La Rosa
Leilani H. Gilpin
OCL
330
0
0
25 Nov 2025
Training Language Models to Explain Their Own Computations
Belinda Z. Li
Zifan Carl Guo
Vincent Huang
Jacob Steinhardt
Jacob Andreas
LRM
209
3
0
11 Nov 2025
Finding Culture-Sensitive Neurons in Vision-Language Models
Xiutian Zhao
Rochelle Choenni
Rohit Saxena
Ivan Titov
VLM
246
0
0
28 Oct 2025
Understanding Multi-View Transformers
Michal Stary
Julien Gaubil
A. Tewari
Vincent Sitzmann
ViT
84
1
0
28 Oct 2025
TextCAM: Explaining Class Activation Map with Text
Qiming Zhao
Xingjian Li
Xiaoyu Cao
Xiaolong Wu
Min Xu
VLM
119
0
0
01 Oct 2025
Granular Concept Circuits: Toward a Fine-Grained Circuit Discovery for Concept Representations
Dahee Kwon
Sehyun Lee
Jaesik Choi
163
1
0
03 Aug 2025
Unraveling Hidden Representations: A Multi-Modal Layer Analysis for Better Synthetic Content Forensics
Tom Or
Omri Azencot
AAML
182
1
0
01 Aug 2025
Explaining How Visual, Textual and Multimodal Encoders Share Concepts
Clément Cornet
Romaric Besançon
Hervé Le Borgne
153
0
0
24 Jul 2025
Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models
Lauren Hyoseo Yoon
Yisong Yue
Been Kim
363
0
0
01 Jul 2025
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
Jingtong Su
Julia Kempe
Karen Ullrich
268
3
0
20 Jun 2025
Evaluating Neuron Explanations: A Unified Framework with Sanity Checks
Tuomas P. Oikarinen
Ge Yan
Tsui-Wei Weng
FAtt XAI
171
7
0
06 Jun 2025
Line of Sight: On Linear Representations in VLLMs
Achyuta Rajaram
Sarah Schwettmann
Jacob Andreas
Arthur Conmy
VLM
283
2
0
05 Jun 2025
Unconditional CNN denoisers contain sparse semantic representation of images
Zahra Kadkhodaie
Stéphane Mallat
Eero P. Simoncelli
DiffM
315
0
0
02 Jun 2025
P: A Universal Measure of Predictive Intelligence
David Gamez
ELM
104
2
0
30 May 2025
Debiasing CLIP: Interpreting and Correcting Bias in Attention Heads
Wei Jie Yeo
Rui Mao
Moloud Abdar
Erik Cambria
Frank Xing
277
3
0
23 May 2025
FastCAV: Efficient Computation of Concept Activation Vectors for Explaining Deep Neural Networks
Laines Schmalwasser
Niklas Penzel
Joachim Denzler
Julia Niebling
175
3
0
23 May 2025
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
Ercong Nie
Helmut Schmid
Hinrich Schütze
365
4
0
22 May 2025
Explainable embeddings with Distance Explainer
Christiaan Meijer
E. G. Patrick Bos
369
0
0
21 May 2025
Out-of-Distribution Detection via Channelwise Feature Aggregation in Neural Network-Based Receivers
Marko Tuononen
Duy Vu
Dani Korpi
Vesa Starck
Ville Hautamäki
372
1
0
21 May 2025
Explaining Neural Networks with Reasons
Levin Hornischer
Hannes Leitgeb
FAtt AAML MILM
319
0
0
20 May 2025
What's Pulling the Strings? Evaluating Integrity and Attribution in AI Training and Inference through Concept Shift
Jiamin Chang
Haoyang Li
Hammond Pearce
Ruoxi Sun
Yue Liu
Minhui Xue
319
0
0
28 Apr 2025
Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
Saif Punjwani
Larry Heck
LRM
235
0
0
14 Apr 2025
Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning
Jiahua Lan
Sen Zhang
Haixia Pan
Ruijun Liu
Li Shen
Dacheng Tao
CLL
281
0
0
09 Apr 2025
Following the Whispers of Values: Unraveling Neural Mechanisms Behind Value-Oriented Behaviors in LLMs
Ling Hu
Yuemei Xu
Xiaoyang Gu
Letao Han
381
1
0
07 Apr 2025
LSNet: See Large, Focus Small
Computer Vision and Pattern Recognition (CVPR), 2025
Ao Wang
Hui Chen
Zijia Lin
Jiawei Han
Guiguang Ding
295
11
0
29 Mar 2025
Effective Skill Unlearning through Intervention and Abstention
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Yongce Li
Chung-En Sun
Tsui-Wei Weng
MU
862
4
0
27 Mar 2025
CoE: Chain-of-Explanation via Automatic Visual Concept Circuit Description and Polysemanticity Quantification
Computer Vision and Pattern Recognition (CVPR), 2025
Wenlong Yu
Qilong Wang
Chuang Liu
Dong Li
Q. Hu
LRM
276
2
0
19 Mar 2025
Representational Similarity via Interpretable Visual Concepts
International Conference on Learning Representations (ICLR), 2025
Neehar Kondapaneni
Oisin Mac Aodha
Pietro Perona
DRL
976
3
0
19 Mar 2025
Post-Hoc Concept Disentanglement: From Correlated to Isolated Concept Representations
Eren Erogullari
Sebastian Lapuschkin
Wojciech Samek
Frederik Pahde
LLMSV CoGe
250
1
0
07 Mar 2025
Superscopes: Amplifying Internal Feature Representations for Language Model Interpretation
Jonathan Jacobi
Gal Niv
LRM ReLM
421
2
0
03 Mar 2025
Steered Generation via Gradient Descent on Sparse Features
Sumanta Bhattacharyya
Pedram Rooshenas
LLMSV
287
0
0
25 Feb 2025
TinyEmo: Scaling down Emotional Reasoning via Metric Projection
Cristian Gutierrez
LRM
523
0
0
17 Feb 2025
Dimensions underlying the representational alignment of deep neural networks with humans
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
F. Mahner
Lukas Muttenthaler
Umut Güçlü
M. Hebart
388
24
0
28 Jan 2025
Faithful Counterfactual Visual Explanations (FCVE)
Knowledge-Based Systems (KBS), 2024
Bismillah Khan
Syed Ali Tariq
Tehseen Zia
Muhammad Ahsan
David Windridge
232
1
0
12 Jan 2025
Towards Counterfactual and Contrastive Explainability and Transparency of DCNN Image Classifiers
Knowledge-Based Systems (KBS), 2022
Syed Ali Tariq
Tehseen Zia
Mubeen Ghafoor
AAML
306
9
0
12 Jan 2025
GPT-2 Through the Lens of Vector Symbolic Architectures
Johannes Knittel
Tushaar Gangavarapu
Hendrik Strobelt
Hanspeter Pfister
155
2
0
10 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
Shijie Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
425
48
0
03 Dec 2024
From CNN to CNN + RNN: Adapting Visualization Techniques for Time-Series Anomaly Detection
Fabien Poirier
AI4TS
242
0
0
07 Nov 2024
Probing Ranking LLMs: A Mechanistic Analysis for Information Retrieval
International Conference on the Theory of Information Retrieval (ICTIR), 2024
Tanya Chowdhury
Atharva Nijasure
James Allan
232
0
0
24 Oct 2024
Exploiting Text-Image Latent Spaces for the Description of Visual Concepts
International Conference on Pattern Recognition (ICPR), 2024
Laines Schmalwasser
J. Gawlikowski
Joachim Denzler
Julia Niebling
180
3
0
23 Oct 2024
Neuron-based Personality Trait Induction in Large Language Models
Jia Deng
Tianyi Tang
Yanbin Yin
Wenhao Yang
Wayne Xin Zhao
Ji-Rong Wen
238
3
0
16 Oct 2024
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
International Conference on Learning Representations (ICLR), 2024
Nick Jiang
Anish Kachinthaya
Suzie Petryk
Yossi Gandelsman
VLM
403
62
0
03 Oct 2024
Linking in Style: Understanding learned features in deep learning models
European Conference on Computer Vision (ECCV), 2024
Maren H. Wehrheim
Pamela Osuna-Vargas
Matthias Kaschube
GAN
184
0
0
25 Sep 2024
Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
International Conference on Computational Linguistics (COLING), 2024
Xufeng Duan
Xinyu Zhou
Bei Xiao
Zhenguang G. Cai
MILM
214
9
0
24 Sep 2024
Optimal ablation for interpretability
Neural Information Processing Systems (NeurIPS), 2024
Maximilian Li
Lucas Janson
FAtt
343
11
0
16 Sep 2024
Unveiling Markov Heads in Pretrained Language Models for Offline Reinforcement Learning
Wenhao Zhao
Qiushui Xu
Linjie Xu
Lei Song
Jinyu Wang
Chunlai Zhou
Jiang Bian
327
0
0
11 Sep 2024
How to Measure Human-AI Prediction Accuracy in Explainable AI Systems
Sujay Koujalgi
Andrew Anderson
Iyadunni Adenuga
Shikha Soneji
Rupika Dikkala
...
Leo Soccio
Sourav Panda
Rupak Kumar Das
Margaret Burnett
Jonathan Dodge
207
3
0
23 Aug 2024
Multilevel Interpretability Of Artificial Neural Networks: Leveraging Framework And Methods From Neuroscience
Zhonghao He
Jascha Achterberg
Katie Collins
Kevin K. Nejad
Danyal Akarca
...
Chole Li
Kai J. Sandbrink
Stephen Casper
Anna Ivanova
Grace W. Lindsay
AI4CE
317
6
0
22 Aug 2024