Efficient Dictionary Learning with Switch Sparse Autoencoders
International Conference on Learning Representations (ICLR), 2024
10 October 2024
Anish Mudide
Joshua Engels
Eric J. Michaud
Max Tegmark
Christian Schroeder de Witt

Papers citing "Efficient Dictionary Learning with Switch Sparse Autoencoders"

18 / 18 papers shown
Beyond Redundancy: Diverse and Specialized Multi-Expert Sparse Autoencoder
Zhen Xu
Zhen Tan
Song Wang
Kaidi Xu
Tianlong Chen
MoE
278
0
0
07 Nov 2025
Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
Xu Wang
Yan Hu
Benyou Wang
Difan Zou
LLMSV
208
1
0
04 Oct 2025
LLM Interpretability with Identifiable Temporal-Instantaneous Representation
Xiangchen Song
Jiaqi Sun
Zijian Li
Yujia Zheng
Kun Zhang
128
0
0
27 Sep 2025
The Secret Agenda: LLMs Strategically Lie and Our Current Safety Tools Are Blind
Caleb DeLeeuw
Gaurav Chawla
Aniket Sharma
Vanessa Dietze
105
1
0
23 Sep 2025
AdaptiveK Sparse Autoencoders: Dynamic Sparsity Allocation for Interpretable LLM Representations
Yifei Yao
Mengnan Du
172
0
0
24 Aug 2025
Attention Layers Add Into Low-Dimensional Residual Subspaces
Junxuan Wang
Xuyang Ge
Wentao Shu
Zhengfu He
Xipeng Qiu
166
0
0
23 Aug 2025
Probing the Representational Power of Sparse Autoencoders in Vision Models
Matthew Lyle Olson
Musashi Hinck
Neale Ratzlaff
Changbai Li
Phillip Howard
Vasudev Lal
Shao-Yen Tseng
212
1
0
15 Aug 2025
Interpreting CFD Surrogates through Sparse Autoencoders
Yeping Hu
Shusen Liu
AI4CE
131
0
0
21 Jul 2025
Incorporating Hierarchical Semantics in Sparse Autoencoder Architectures
Mark Muchane
Sean Richardson
Kiho Park
Victor Veitch
212
2
0
01 Jun 2025
Kronecker Factorization Improves Efficiency and Interpretability of Sparse Autoencoders
Daniil Laptev
Gleb Gerasimov
Yaroslav Aksenov
Daniil Gavrilov
Nikita Balagansky
248
0
0
28 May 2025
Sparsification and Reconstruction from the Perspective of Representation Geometry
Wenjie Sun
Bingzhe Wu
Zhile Yang
Chengke Wu
237
0
0
28 May 2025
Evaluating Adversarial Robustness of Concept Representations in Sparse Autoencoders
Aaron Jiaxun Li
Suraj Srinivas
Usha Bhalla
Himabindu Lakkaraju
AAML
370
4
0
21 May 2025
Are Sparse Autoencoders Useful for Java Function Bug Detection?
Rui Melo
Claudia Mamede
Andre Catarino
Rui Abreu
Henrique Lopes Cardoso
413
1
0
15 May 2025
Revisiting End-To-End Sparse Autoencoder Training: A Short Finetune Is All You Need
Adam Karvonen
282
1
0
21 Mar 2025
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Adam Karvonen
Can Rager
Johnny Lin
Curt Tigges
Joseph Isaac Bloom
...
Matthew Wearden
Arthur Conmy
Samuel Marks
Neel Nanda
MU
593
51
0
12 Mar 2025
Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
Subhash Kantamneni
Joshua Engels
Senthooran Rajamanoharan
Max Tegmark
Neel Nanda
356
46
0
23 Feb 2025
Steering Language Model Refusal with Sparse Autoencoders
Kyle O'Brien
David Majercak
Xavier Fernandes
Richard Edgar
Blake Bullwinkel
Jingya Chen
Harsha Nori
Dean Carignan
Eric Horvitz
Forough Poursabzi-Sangde
LLMSV
389
40
0
18 Nov 2024
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders
Zhengfu He
Wentao Shu
Xuyang Ge
Lingjie Chen
Junxuan Wang
...
Qipeng Guo
Xuanjing Huang
Zuxuan Wu
Yu-Gang Jiang
Xipeng Qiu
337
77
0
27 Oct 2024