ResearchTrend.AI
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models

17 February 2025
Z. He
Haiyan Zhao
Yiran Qiao
Fan Yang
Ali Payani
Jing Ma
Mengnan Du
    LLMSV
arXiv: 2502.11356

Papers citing "SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models"

Title
No papers