ResearchTrend.AI
Analyzing (In)Abilities of SAEs via Formal Languages

15 October 2024
Abhinav Menon, Manish Shrivastava, David M. Krueger, Ekdeep Singh Lubana

Papers citing "Analyzing (In)Abilities of SAEs via Formal Languages"

5 papers shown:

  1. How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders
     Tatsuro Inaba, Kentaro Inui, Yusuke Miyao, Yohei Oseki, Benjamin Heinzerling, Yu Takagi
     09 Mar 2025

  2. Mixture of Experts Made Intrinsically Interpretable
     Xingyi Yang, Constantin Venhoff, Ashkan Khakzar, Christian Schroeder de Witt, P. Dokania, Adel Bibi, Philip H. S. Torr
     05 Mar 2025

  3. Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry
     Sai Sumedh R. Hindupur, Ekdeep Singh Lubana, Thomas Fel, Demba Ba
     03 Mar 2025

  4. FADE: Why Bad Descriptions Happen to Good Features
     Bruno Puri, Aakriti Jain, Elena Golimblevskaia, Patrick Kahardipraja, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
     24 Feb 2025

  5. Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
     Harrish Thasarathan, Julian Forsyth, Thomas Fel, M. Kowal, Konstantinos G. Derpanis
     06 Feb 2025