Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference

27 January 2025
Go Kamoda
Benjamin Heinzerling
Tatsuro Inaba
Keito Kudo
Keisuke Sakaguchi
Kentaro Inui
Abstract

According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that form the model's "inner vocabulary". Prior analysis of this detokenization stage has predominantly relied on probing and interventions such as path patching, which involve selecting particular inputs, choosing a subset of components that will be patched, and then observing changes in model behavior. Here, we show that several important aspects of the detokenization stage can be understood purely by analyzing model weights, without performing any model inference steps. Specifically, we introduce an analytical decomposition of first-layer attention in GPT-2. Our decomposition yields interpretable terms that quantify the relative contributions of position-related, token-related, and mixed effects. By focusing on terms in this decomposition, we discover weight-based explanations of attention bias toward close tokens and attention for detokenization.
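As a rough illustration of the weight-only approach described in the abstract (not the authors' code), the sketch below loads GPT-2's embeddings and first-layer query/key weights from the Hugging Face transformers library and forms the position-position term of the attention-logit decomposition, the piece most directly related to the reported bias toward close tokens. The split of c_attn, the omission of LayerNorm and biases, and the names W_QK and pos_pos are simplifying assumptions made for this sketch; the paper's exact decomposition may differ in detail.

# A minimal, hypothetical sketch (not the paper's code): read GPT-2 weights
# and form the position-position part of the first-layer attention logits
# without running any forward pass.
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
E = model.wte.weight.detach()        # token embeddings, shape (vocab, d)
P = model.wpe.weight.detach()        # position embeddings, shape (n_ctx, d)

# GPT-2 fuses the Q, K, V projections into one matrix; take the Q and K
# slices of layer 0. LayerNorm and biases are ignored here for simplicity.
W = model.h[0].attn.c_attn.weight.detach()   # shape (d, 3d)
d = E.shape[1]
W_Q, W_K = W[:, :d], W[:, d:2 * d]
W_QK = W_Q @ W_K.T                   # bilinear form shared by all four terms

# Position-position term: an (n_ctx, n_ctx) map computed purely from weights.
pos_pos = P @ W_QK @ P.T

The analogous token-related and mixed terms are obtained by substituting E for P on either side of W_QK.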

View on arXiv: https://arxiv.org/abs/2501.15754
@article{kamoda2025_2501.15754,
  title={Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference},
  author={Go Kamoda and Benjamin Heinzerling and Tatsuro Inaba and Keito Kudo and Keisuke Sakaguchi and Kentaro Inui},
  journal={arXiv preprint arXiv:2501.15754},
  year={2025}
}