Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan, Julian Forsyth, Thomas Fel, M. Kowal, Konstantinos G. Derpanis
arXiv:2502.03714, 6 February 2025
Papers citing "Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment" (6 papers)
Interpreting the Linear Structure of Vision-language Model Embedding Spaces
Isabel Papadimitriou, Huangyuan Su, Thomas Fel, Naomi Saphra, Sham Kakade, Stephanie Gil
16 Apr 2025 (VLM)
Robustly identifying concepts introduced during chat fine-tuning using crosscoders
Julian Minder, Clement Dumas, Caden Juang, Bilal Chughtai, Neel Nanda
03 Apr 2025
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Mateusz Pach, Shyamgopal Karthik, Quentin Bouniot, Serge Belongie, Zeynep Akata
03 Apr 2025 (VLM)
Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
Sai Sumedh R. Hindupur, Ekdeep Singh Lubana, Thomas Fel, Demba Ba
03 Mar 2025
Mind the Gap: Bridging the Divide Between AI Aspirations and the Reality of Autonomous Characterization
Grace Guinan, Addison Salvador, Michelle A. Smeaton, Andrew Glaws, Hilary Egan, Brian C. Wyatt, Babak Anasori, K. Fiedler, M. Olszta, Steven Spurgeon
25 Feb 2025
Sparse Autoencoders Reveal Universal Feature Spaces Across Large Language Models
Michael Lan, Philip H. S. Torr, Austin Meek, Ashkan Khakzar, David M. Krueger, Fazl Barez
09 Oct 2024