Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2010.02066
Cited By
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks
5 October 2020
Róbert Csordás
Sjoerd van Steenkiste
Jürgen Schmidhuber
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks"
50 / 70 papers shown
Title
An XAI-based Analysis of Shortcut Learning in Neural Networks
Phuong Quynh Le
Jorg Schlotterer
Christin Seifert
AAML
29
0
0
22 Apr 2025
MIB: A Mechanistic Interpretability Benchmark
Aaron Mueller
Atticus Geiger
Sarah Wiegreffe
Dana Arad
Iván Arcuschin
...
Alessandro Stolfo
Martin Tutek
Amir Zur
David Bau
Yonatan Belinkov
43
1
0
17 Apr 2025
Are formal and functional linguistic mechanisms dissociated in language models?
Michael Hanna
Sandro Pezzelle
Yonatan Belinkov
47
0
0
14 Mar 2025
Building Bridges, Not Walls -- Advancing Interpretability by Unifying Feature, Data, and Model Component Attribution
Shichang Zhang
Tessa Han
Usha Bhalla
Hima Lakkaraju
FAtt
147
0
0
17 Feb 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
34
2
0
28 Oct 2024
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Philipp Mondorf
Sondre Wold
Barbara Plank
34
0
0
02 Oct 2024
Breaking Neural Network Scaling Laws with Modularity
Akhilan Boopathy
Sunshine Jiang
William Yue
Jaedong Hwang
Abhiram Iyer
Ila Fiete
OOD
39
2
0
09 Sep 2024
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small
Maheep Chaudhary
Atticus Geiger
26
13
0
05 Sep 2024
Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations
Róbert Csordás
Christopher Potts
Christopher D. Manning
Atticus Geiger
GAN
28
15
0
20 Aug 2024
Block-Operations: Using Modular Routing to Improve Compositional Generalization
Florian Dietz
Dietrich Klakow
AI4CE
19
0
0
01 Aug 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs
Nitay Calderon
Roi Reichart
38
10
0
27 Jul 2024
Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification
Vivi Nastase
Paola Merlo
28
2
0
25 Jul 2024
Out of spuriousity: Improving robustness to spurious correlations without group annotations
Phuong Quynh Le
Jorg Schlotterer
Christin Seifert
CML
OOD
37
2
0
20 Jul 2024
How and where does CLIP process negation?
Vincent Quantmeyer
Pablo Mosteiro
Albert Gatt
CoGe
29
6
0
15 Jul 2024
Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning
Lei Yu
Jingcheng Niu
Zining Zhu
Gerald Penn
38
5
0
04 Jul 2024
Are there identifiable structural parts in the sentence embedding whole?
Vivi Nastase
Paola Merlo
32
3
0
24 Jun 2024
METRIK: Measurement-Efficient Randomized Controlled Trials using Transformers with Input Masking
S. Lala
Niraj K. Jha
16
0
0
24 Jun 2024
When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models
Ting-Yun Chang
Jesse Thomason
Robin Jia
40
4
0
19 Jun 2024
Interpretability of Language Models via Task Spaces
Lucas Weber
Jaap Jumelet
Elia Bruni
Dieuwke Hupkes
35
4
0
10 Jun 2024
Debiasing surgeon: fantastic weights and how to find them
Rémi Nahon
Ivan Luiz De Moura Matos
Van-Tam Nguyen
Enzo Tartaglione
36
1
0
21 Mar 2024
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
Jing-ling Huang
Zhengxuan Wu
Christopher Potts
Mor Geva
Atticus Geiger
59
26
0
27 Feb 2024
Assessing the Portability of Parameter Matrices Trained by Parameter-Efficient Finetuning Methods
Mohammed Sabry
Anya Belz
31
0
0
25 Jan 2024
A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments
Zhengxuan Wu
Atticus Geiger
Jing-ling Huang
Aryaman Arora
Thomas F. Icard
Christopher Potts
Noah D. Goodman
33
6
0
23 Jan 2024
Compressing Image-to-Image Translation GANs Using Local Density Structures on Their Learned Manifold
Alireza Ganjdanesh
Shangqian Gao
Hirad Alipanah
Heng-Chiao Huang
GAN
27
6
0
22 Dec 2023
Examining Modularity in Multilingual LMs via Language-Specialized Subnetworks
Rochelle Choenni
Ekaterina Shutova
Daniel H Garrette
27
8
0
14 Nov 2023
Uncovering Intermediate Variables in Transformers using Circuit Probing
Michael A. Lepori
Thomas Serre
Ellie Pavlick
75
7
0
07 Nov 2023
Winning Prize Comes from Losing Tickets: Improve Invariant Learning by Exploring Variant Parameters for Out-of-Distribution Generalization
Zhuo Huang
Muyang Li
Li Shen
Jun-chen Yu
Chen Gong
Bo Han
Tongliang Liu
OOD
38
8
0
25 Oct 2023
Visually Grounded Continual Language Learning with Selective Specialization
Kyra Ahrens
Lennart Bengtson
Jae Hee Lee
Stefan Wermter
19
0
0
24 Oct 2023
Unlocking Emergent Modularity in Large Language Models
Zihan Qiu
Zeyu Huang
Jie Fu
22
8
0
17 Oct 2023
Instilling Inductive Biases with Subnetworks
Enyan Zhang
Michael A. Lepori
Ellie Pavlick
AI4CE
8
4
0
17 Oct 2023
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models
Deniz Bayazit
Negar Foroutan
Zeming Chen
Gail Weiss
Antoine Bosselut
KELM
24
13
0
04 Oct 2023
Modularity in Deep Learning: A Survey
Haozhe Sun
Isabelle Guyon
MoMe
30
2
0
02 Oct 2023
NeuroSurgeon: A Toolkit for Subnetwork Analysis
Michael A. Lepori
Ellie Pavlick
Thomas Serre
11
7
0
01 Sep 2023
Differentiable Weight Masks for Domain Transfer
Samarth Khanna
Skanda Vaidyanath
Akash Velu
21
0
0
26 Aug 2023
Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models
Kecheng Zheng
Wei Wu
Ruili Feng
Kai Zhu
Jiawei Liu
Deli Zhao
Zhengjun Zha
Wei Chen
Yujun Shen
VLM
19
8
0
27 Jul 2023
Causal interventions expose implicit situation models for commonsense language understanding
Takateru Yamakoshi
James L. McClelland
A. Goldberg
Robert D. Hawkins
17
6
0
06 Jun 2023
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
27
45
0
06 Jun 2023
Neural Sculpting: Uncovering hierarchically modular task structure in neural networks through pruning and network analysis
S. M. Patil
Loizos Michael
C. Dovrolis
34
0
0
28 May 2023
Emergent Modularity in Pre-trained Transformers
Zhengyan Zhang
Zhiyuan Zeng
Yankai Lin
Chaojun Xiao
Xiaozhi Wang
Xu Han
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Jie Zhou
MoE
37
23
0
28 May 2023
Similarity of Neural Network Models: A Survey of Functional and Representational Measures
Max Klabunde
Tobias Schumacher
M. Strohmaier
Florian Lemmerich
52
64
0
10 May 2023
Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu
Eric Gan
Max Tegmark
21
36
0
04 May 2023
Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations
Atticus Geiger
Zhengxuan Wu
Christopher Potts
Thomas F. Icard
Noah D. Goodman
CML
75
98
0
05 Mar 2023
Does Deep Learning Learn to Abstract? A Systematic Probing Framework
Shengnan An
Zeqi Lin
B. Chen
Qiang Fu
Nanning Zheng
Jian-Guang Lou
31
4
0
23 Feb 2023
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
23
73
0
22 Feb 2023
Break It Down: Evidence for Structural Compositionality in Neural Networks
Michael A. Lepori
Thomas Serre
Ellie Pavlick
33
29
0
26 Jan 2023
Towards Modular Machine Learning Solution Development: Benefits and Trade-offs
Samiyuru Menik
Lakshmish Ramaswamy
29
4
0
23 Jan 2023
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models
Peter Hase
Mohit Bansal
Been Kim
Asma Ghandeharioun
MILM
34
167
0
10 Jan 2023
Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation
Yifan Zhou
Shubham D. Sonawani
Mariano Phielipp
Simon Stepputtis
H. B. Amor
LM&Ro
14
27
0
08 Dec 2022
Logical Tasks for Measuring Extrapolation and Rule Comprehension
Ippei Fujisawa
Ryota Kanai
ELM
LRM
20
4
0
14 Nov 2022
Training Debiased Subnetworks with Contrastive Weight Pruning
Geon Yeong Park
Sangmin Lee
Sang Wan Lee
Jong Chul Ye
CML
30
13
0
11 Oct 2022
1
2
Next