Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking

27 February 2025
Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin
Abstract

Chain-of-Thought (CoT) significantly enhances the performance of large language models (LLMs) across a wide range of tasks, and prior research shows that CoT can theoretically increase expressiveness. However, there is limited mechanistic understanding of the algorithms that Transformer+CoT can learn. In this work, we (1) evaluate the state-tracking capabilities of Transformer+CoT and its variants, confirming the effectiveness of CoT; (2) identify the circuit, i.e., the subset of model components responsible for tracking the world state, finding that late-layer MLP neurons play a key role; we propose two metrics, compression and distinction, and show that the neuron sets for each state achieve nearly 100% accuracy, providing evidence of an implicit finite state automaton (FSA) embedded within the model; and (3) explore three realistic settings: skipping intermediate steps, introducing data noise, and testing length generalization. Our results demonstrate that Transformer+CoT learns robust algorithms (FSA), highlighting its resilience in challenging scenarios.
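To make the state-tracking setup concrete, the sketch below builds a toy task of the kind the abstract describes: the input is a sequence of transition symbols, the chain-of-thought is the sequence of intermediate automaton states, and the answer is the final state. This is a minimal illustrative sketch, not the authors' released code; the specific automaton (addition modulo 3), the fixed start state, and the `make_example` helper are assumptions chosen only to show the task format.

```python
# Minimal sketch (assumed setup, not the paper's code): a toy FSA
# state-tracking task. A Transformer trained with CoT would be asked to
# emit every intermediate state before reporting the final one.
import random

# Illustrative 3-state FSA: states {0,1,2}, inputs {0,1,2},
# delta(state, symbol) = (state + symbol) mod 3.
STATES = [0, 1, 2]
ALPHABET = [0, 1, 2]

def delta(state: int, symbol: int) -> int:
    """One FSA transition (addition modulo 3)."""
    return (state + symbol) % 3

def make_example(length: int, seed: int | None = None) -> dict:
    """Build one training example.

    'prompt' : the input symbol sequence,
    'cot'    : the chain-of-thought, i.e. every intermediate world state,
    'answer' : the final state the model must report.
    """
    rng = random.Random(seed)
    symbols = [rng.choice(ALPHABET) for _ in range(length)]
    state = 0  # assumed fixed start state
    trace = []
    for s in symbols:
        state = delta(state, s)
        trace.append(state)
    return {
        "prompt": " ".join(str(s) for s in symbols),
        "cot": " ".join(str(t) for t in trace),
        "answer": str(trace[-1]),
    }

if __name__ == "__main__":
    ex = make_example(length=8, seed=0)
    print("input symbols:", ex["prompt"])
    print("CoT states   :", ex["cot"])
    print("final state  :", ex["answer"])
```

Under this framing, the paper's circuit analysis would then ask which late-layer MLP neurons separate examples by the current state in the trace; the compression and distinction metrics quantify how cleanly the per-state neuron sets do so.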

View on arXiv
@article{zhang2025_2502.20129,
  title={Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking},
  author={Yifan Zhang and Wenyu Du and Dongming Jin and Jie Fu and Zhi Jin},
  journal={arXiv preprint arXiv:2502.20129},
  year={2025}
}