Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders

16 May 2025
David Chanin
Tomáš Dulka
Adrià Garriga-Alonso
arXiv:2505.11756

Papers citing "Feature Hedging: Correlated Features Break Narrow Sparse Autoencoders"

OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features
Anton Korznikov
Andrey V. Galichin
Alexey Dontsov
Oleg Y. Rogov
Elena Tutubalina
Ivan Oseledets
104
0
0
26 Sep 2025
Towards Atoms of Large Language Models
Chenhui Hu
Pengfei Cao
Yubo Chen
Kang Liu
Jun Zhao
96
0
0
25 Sep 2025
Sparse but Wrong: Incorrect L0 Leads to Incorrect Features in Sparse Autoencoders
David Chanin
Adrià Garriga-Alonso
128
0
0
22 Aug 2025
Learning Multi-Level Features with Matryoshka Sparse Autoencoders
Bart Bussmann
Noa Nabeshima
Adam Karvonen
Neel Nanda
257
43
0
21 Mar 2025
Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
Subhash Kantamneni
Joshua Engels
Senthooran Rajamanoharan
Max Tegmark
Neel Nanda
312
39
0
23 Feb 2025
Decomposing The Dark Matter of Sparse Autoencoders
Joshua Engels
Logan Riggs
Max Tegmark
274
29
0
18 Oct 2024
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks
Can Rager
Eric J. Michaud
Yonatan Belinkov
David Bau
Aaron Mueller
451
233
0
28 Mar 2024