arXiv:2502.15576
Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders
24 February 2025
Xuansheng Wu
Jiayi Yuan
Wenlin Yao
Xiaoming Zhai
Ninghao Liu
LLMSV
Papers citing "Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders"
Title
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders
Dong Shu
Xuansheng Wu
Haiyan Zhao
Mengnan Du
Ninghao Liu
LLMSV
35
0
0
12 May 2025
Can GPT tell us why these images are synthesized? Empowering Multimodal Large Language Models for Forensics
Yiran He
Yun Cao
Bowen Yang
Zeyu Zhang
24
0
0
16 Apr 2025
Towards Trustworthy GUI Agents: A Survey
Yucheng Shi
Wenhao Yu
Wenlin Yao
Wenhu Chen
Ninghao Liu
39
2
0
30 Mar 2025