k-Sparse Autoencoders

19 December 2013

Papers citing "k-Sparse Autoencoders"

Title
Inference-Time Decomposition of Activations (ITDA): A Scalable Approach to Interpreting Large Language Models Patrick Leask Neel Nanda Noura Al Moubayed 44 1 0 23 May 2025
TIDE : Temporal-Aware Sparse Autoencoders for Interpretable Diffusion Transformers in Image Generation Victor Shea-Jay Huang Le Zhuo Yi Xin Zhaokai Wang Peng Gao Hongsheng Li DiffM 128 1 0 10 Mar 2025
Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation Tiansheng Wen Yifei Wang Zequn Zeng Zhong Peng Yudi Su Xinyang Liu Bo Chen Hongwei Liu Stefanie Jegelka Chenyu You CLL 135 3 0 03 Mar 2025
SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders Bartosz Cywiński Kamil Deja DiffM 89 8 0 29 Jan 2025
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models Konstantin Donhauser Kristina Ulicna Gemma Elyse Moran Aditya Ravuri Kian Kenyon-Dean Cian Eastwood Jason Hartford 113 0 0 20 Dec 2024
Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders Charles O'Neill David Klindt 125 1 0 20 Nov 2024
Decomposing The Dark Matter of Sparse Autoencoders Joshua Engels Logan Riggs Max Tegmark LLMSV 70 12 0 18 Oct 2024
Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax I. Butakov Alexander Semenenko Alexander Tolmachev Andrey Gladkov Marina Munkhoeva Alexey Frolov 97 1 0 09 Oct 2024
Residual Stream Analysis with Multi-Layer SAEs Tim Lawson Lucy Farnik Conor Houghton Laurence Aitchison 48 5 0 06 Sep 2024
Unsupervised Machine Learning Hybrid Approach Integrating Linear Programming in Loss Function: A Robust Optimization Technique Andrew Kiruluta Andreas Lemos 71 0 0 19 Aug 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models Samuel Marks Can Rager Eric J. Michaud Yonatan Belinkov David Bau Aaron Mueller 98 137 0 28 Mar 2024
PCAAE: Principal Component Analysis Autoencoder for organising the latent space of generative networks Chi-Hieu Pham Saïd Ladjal A. Newson DRL 50 31 0 14 Jun 2020
SCAT: Second Chance Autoencoder for Textual Data Somaieh Goudarzvand Gharib Gharibi Yugyung Lee 23 3 0 11 May 2020
Improving neural networks by preventing co-adaptation of feature detectors Geoffrey E. Hinton Nitish Srivastava A. Krizhevsky Ilya Sutskever Ruslan Salakhutdinov VLM 385 7,650 0 03 Jul 2012
Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition Koray Kavukcuoglu Marc'Aurelio Ranzato Yann LeCun 89 248 0 18 Oct 2010