2505.17495
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
23 May 2025
Landon Butler
Abhineet Agarwal
Justin Singh Kang
Yigit Efe Erginbas
Bin Yu
Kannan Ramchandran
Papers citing "ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs"
7 / 7 papers shown
SPEX: Scaling Feature Interaction Explanations for LLMs
Justin Singh Kang
Landon Butler
Abhineet Agarwal
Yigit Efe Erginbas
Ramtin Pedarsani
Kannan Ramchandran
Bin Yu
VLM
LRM
114
2
0
20 Feb 2025
Attribution Patching Outperforms Automated Circuit Discovery
Aaquib Syed
Can Rager
Arthur Conmy
110
61
0
16 Oct 2023
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
117
7,386
0
02 Oct 2019
A Fine-Grained Spectral Perspective on Neural Networks
Greg Yang
Hadi Salman
57
112
0
24 Jul 2019
Deep learning generalizes because the parameter-function map is biased towards simple functions
Guillermo Valle Pérez
Chico Q. Camargo
A. Louis
MLT
AI4CE
49
231
0
22 May 2018
Axiomatic Attribution for Deep Networks
Mukund Sundararajan
Ankur Taly
Qiqi Yan
OOD
FAtt
108
5,920
0
04 Mar 2017
Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers
Alexander Binder
G. Montavon
Sebastian Lapuschkin
K. Müller
Wojciech Samek
FAtt
54
456
0
04 Apr 2016