How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders

9 March 2025
Tatsuro Inaba, Kentaro Inui, Yusuke Miyao, Yohei Oseki, Benjamin Heinzerling, Yu Takagi
Abstract

Large Language Models (LLMs) demonstrate remarkable multilingual capabilities and broad knowledge. However, the internal mechanisms underlying the development of these capabilities remain poorly understood. To investigate this, we analyze how the information encoded in LLMs' internal representations evolves during the training process. Specifically, we train sparse autoencoders at multiple checkpoints of the model and systematically compare the interpretative results across these stages. Our findings suggest that LLMs initially acquire language-specific knowledge independently, followed by cross-linguistic correspondences. Moreover, we observe that after mastering token-level knowledge, the model transitions to learning higher-level, abstract concepts, indicating the development of more conceptual understanding.
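To make the method described above concrete, below is a minimal sketch of a sparse autoencoder of the kind the abstract refers to, fitted on hidden activations collected from one model checkpoint. The layer sizes, L1 coefficient, optimizer settings, and the synthetic activations are illustrative assumptions, not the authors' actual configuration; in the paper's setting one such SAE would be trained per checkpoint and the resulting features compared across training stages.

```python
# Sketch of a sparse autoencoder (SAE) trained on LLM hidden activations.
# All hyperparameters and the fake activations are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # activation -> feature coefficients
        self.decoder = nn.Linear(d_hidden, d_model)  # feature coefficients -> reconstruction

    def forward(self, x: torch.Tensor):
        features = torch.relu(self.encoder(x))       # non-negative, pushed toward sparsity by the L1 term
        reconstruction = self.decoder(features)
        return reconstruction, features


def train_sae(activations: torch.Tensor, d_hidden: int = 4096,
              l1_coeff: float = 1e-3, epochs: int = 10, lr: float = 1e-4):
    """Fit an SAE on a batch of hidden activations from a single checkpoint."""
    sae = SparseAutoencoder(activations.shape[-1], d_hidden)
    opt = torch.optim.Adam(sae.parameters(), lr=lr)
    for _ in range(epochs):
        recon, feats = sae(activations)
        # Reconstruction error plus an L1 sparsity penalty on the feature activations.
        loss = F.mse_loss(recon, activations) + l1_coeff * feats.abs().sum(dim=-1).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae


if __name__ == "__main__":
    # Random tensors standing in for residual-stream activations gathered at a checkpoint.
    fake_activations = torch.randn(1024, 512)
    sae = train_sae(fake_activations, d_hidden=2048)
    _, feats = sae(fake_activations)
    print("mean active features per token:", (feats > 0).float().sum(dim=-1).mean().item())
```

Comparing which features are active, and what inputs they respond to, for SAEs trained at successive checkpoints is what allows the interpretative comparison across training stages described in the abstract.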

View on arXiv
@article{inaba2025_2503.06394,
  title={How LLMs Learn: Tracing Internal Representations with Sparse Autoencoders},
  author={Tatsuro Inaba and Kentaro Inui and Yusuke Miyao and Yohei Oseki and Benjamin Heinzerling and Yu Takagi},
  journal={arXiv preprint arXiv:2503.06394},
  year={2025}
}