
Tracing Multilingual Representations in LLMs with Cross-Layer Transcoders

Main: 11 pages · Appendix: 29 pages · Bibliography: 2 pages · 48 figures · 5 tables
Abstract

Multilingual Large Language Models (LLMs) can process many languages, yet how they internally represent this diversity remains unclear. Do they form shared multilingual representations with language-specific decoding, and if so, why does performance favor the dominant training language? To address this, we train models on different multilingual mixtures and analyze their internal mechanisms using Cross-Layer Transcoders (CLTs) and Attribution Graphs. Our results reveal shared multilingual representations: the models employ highly similar features across languages, while language-specific decoding emerges in later layers.
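To make the "highly similar features across languages" claim concrete, here is a minimal sketch of one way to quantify cross-lingual feature overlap from CLT activations. It assumes you already have a per-feature activation vector for each prompt; the names `acts_en` / `acts_de` and the top-k Jaccard-overlap metric are illustrative assumptions, not the paper's exact procedure.

```python
# Hypothetical sketch: quantify cross-lingual feature overlap from CLT activations.
import numpy as np

def top_k_features(acts: np.ndarray, k: int = 50) -> set[int]:
    """Return indices of the k most active CLT features for one prompt."""
    return set(np.argsort(acts)[-k:])

def cross_lingual_overlap(acts_a: np.ndarray, acts_b: np.ndarray, k: int = 50) -> float:
    """Jaccard overlap of the top-k active features for a parallel prompt pair."""
    a, b = top_k_features(acts_a, k), top_k_features(acts_b, k)
    return len(a & b) / len(a | b)

# Example with random stand-in activations (replace with real CLT feature outputs).
rng = np.random.default_rng(0)
acts_en = rng.random(4096)   # feature activations for an English prompt
acts_de = rng.random(4096)   # feature activations for its German translation
print(f"top-50 feature overlap: {cross_lingual_overlap(acts_en, acts_de):.2f}")
```

A high overlap for translated prompt pairs, compared to unrelated prompts, would indicate shared multilingual features in the sense described above.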
