2505.17495
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
23 May 2025
Landon Butler
Abhineet Agarwal
Justin Singh Kang
Yigit Efe Erginbas
Bin Yu
Kannan Ramchandran
Papers citing "ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs"
9 / 9 papers shown
SPEX: Scaling Feature Interaction Explanations for LLMs
Justin Singh Kang
Landon Butler
Abhineet Agarwal
Yigit Efe Erginbas
Ramtin Pedarsani
Kannan Ramchandran
Bin Yu
VLM
LRM
114
2
0
20 Feb 2025
Closed-Form Feedback-Free Learning with Forward Projection
Robert O'Shea
Bipin Rajendran
45
18
0
27 Jan 2025
Attribution Patching Outperforms Automated Circuit Discovery
Aaquib Syed
Can Rager
Arthur Conmy
112
62
0
16 Oct 2023
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
126
7,437
0
02 Oct 2019
A Fine-Grained Spectral Perspective on Neural Networks
Greg Yang
Hadi Salman
57
113
0
24 Jul 2019
Deep learning generalizes because the parameter-function map is biased towards simple functions
Guillermo Valle Pérez
Chico Q. Camargo
A. Louis
MLT
AI4CE
55
231
0
22 May 2018
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg
Su-In Lee
FAtt
538
21,613
0
22 May 2017
Axiomatic Attribution for Deep Networks
Mukund Sundararajan
Ankur Taly
Qiqi Yan
OOD
FAtt
115
5,920
0
04 Mar 2017
Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers
Alexander Binder
G. Montavon
Sebastian Lapuschkin
K. Müller
Wojciech Samek
FAtt
54
456
0
04 Apr 2016