ResearchTrend.AI
arXiv: 2502.15576
Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders

24 February 2025
Xuansheng Wu, Jiayi Yuan, Wenlin Yao, Xiaoming Zhai, Ninghao Liu
LLMSV

Papers citing "Interpreting and Steering LLMs with Mutual Information-based Explanations on Sparse Autoencoders"

3 / 3 papers shown
Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders
Dong Shu, Xuansheng Wu, Haiyan Zhao, Mengnan Du, Ninghao Liu
LLMSV
35 | 0 | 0
12 May 2025

Can GPT tell us why these images are synthesized? Empowering Multimodal Large Language Models for Forensics
Yiran He, Yun Cao, Bowen Yang, Zeyu Zhang
24 | 0 | 0
16 Apr 2025

Towards Trustworthy GUI Agents: A Survey
Yucheng Shi, Wenhao Yu, Wenlin Yao, Wenhu Chen, Ninghao Liu
39 | 2 | 0
30 Mar 2025