Steering CLIP's vision transformer with sparse autoencoders

11 April 2025

Yossi Gandelsman

Blake A. Richards

Papers citing "Steering CLIP's vision transformer with sparse autoencoders"

9 / 9 papers shown

Title
MEIcoder: Decoding Visual Stimuli from Neural Activity by Leveraging Most Exciting Inputs Jan Sobotka Luca Baroni Ján Antolík OffRL 64 0 0 23 Oct 2025
From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers Praneet Suresh Jack Stanley Sonia Joseph Luca Scimeca Danilo Bzdok 104 1 0 08 Sep 2025
Towards Mechanistic Defenses Against Typographic Attacks in CLIP Lorenz Hufe Constantin Venhoff Maximilian Dreyer Sebastian Lapuschkin Wojciech Samek AAML 124 1 0 28 Aug 2025
From What to How: Attributing CLIP's Latent Components Reveals Unexpected Semantic Reliance Maximilian Dreyer Lorenz Hufe J. Berend Thomas Wiegand Sebastian Lapuschkin Wojciech Samek 196 2 0 26 May 2025
Symbolic Rule Extraction from Attention-Guided Sparse Representations in Vision Transformers. Theory and Practice of Logic Programming (TPLP), 2025 Parth Padalkar Gopal Gupta 137 0 0 10 May 2025
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video Sonia Joseph Praneet Suresh Lorenz Hufe Edward Stevinson Robert Graham Yash Vadi Danilo Bzdok Sebastian Lapuschkin Lee Sharkey Blake A. Richards 403 8 0 28 Apr 2025
Interpreting the Second-Order Effects of Neurons in CLIP Yossi Gandelsman Alexei A. Efros Jacob Steinhardt MILM 319 32 0 06 Jun 2024
Discriminative Class Tokens for Text-to-Image Diffusion Models. IEEE International Conference on Computer Vision (ICCV), 2023 Idan Schwartz Vésteinn Snaebjarnarson Hila Chefer Robert Bamler Serge Belongie Lior Wolf Sagie Benaim 310 12 0 30 Mar 2023
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence. International Conference on Learning Representations (ICLR), 2022 Frederik Pahde Maximilian Dreyer Leander Weber Moritz Weckbecker Christopher J. Anders Thomas Wiegand Wojciech Samek Sebastian Lapuschkin 322 17 0 07 Feb 2022