2504.02821
Cited By
v3 (latest)
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
3 April 2025
Mateusz Pach
Shyamgopal Karthik
Quentin Bouniot
Serge Belongie
Zeynep Akata
VLM
HuggingFace (10 upvotes)
Github (45★)
Papers citing
"Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models"
11 / 11 papers shown
ActivationReasoning: Logical Reasoning in Latent Activation Spaces
Lukas Helff
Ruben Härle
Wolfgang Stammer
Felix Friedrich
Manuel Brack
Antonia Wüst
Hikaru Shindo
P. Schramowski
Kristian Kersting
LLMSV
LRM
AI4CE
319
2
0
21 Oct 2025
Causal Interpretation of Sparse Autoencoder Features in Vision
Sangyu Han
Yearim Kim
Nojun Kwak
ViT
CML
141
1
0
31 Aug 2025
Model-Agnostic Gender Bias Control for Text-to-Image Generation via Sparse Autoencoder
Chao Wu
Zhenyi Wang
Kangxian Xie
Naresh Kumar Devulapally
Vishnu Suresh Lokhande
Mingchen Gao
308
2
0
28 Jul 2025
Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
Yulu Qin
Dheeraj Varghese
Adam Dahlgren Lindström
L. Donatelli
Kanishka Misra
Najoung Kim
VLM
226
6
0
17 Jul 2025
Sparse Autoencoders, Again?
Yin Lu
X. Zhu
Tong He
David Wipf
395
1
0
05 Jun 2025
From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance
Maximilian Dreyer
Lorenz Hufe
J. Berend
Thomas Wiegand
Sebastian Lapuschkin
Wojciech Samek
279
2
0
26 May 2025
Interpretable and Testable Vision Features via Sparse Autoencoders
Samuel Stevens
Wei-Lun Chao
T. Berger-Wolf
Yu-Chuan Su
VLM
492
17
0
10 Feb 2025
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
ELM
402
31
0
06 Feb 2025
Concept Steerers: Leveraging K-Sparse Autoencoders for Test-Time Controllable Generations
Dahye Kim
Deepti Ghadiyaram
LLMSV
DiffM
653
7
0
31 Jan 2025
SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders
Bartosz Cywiński
Kamil Deja
DiffM
612
49
0
29 Jan 2025
Interpreting the Second-Order Effects of Neurons in CLIP
Yossi Gandelsman
Alexei A. Efros
Jacob Steinhardt
MILM
547
35
0
06 Jun 2024