Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models

International Conference on Machine Learning (ICML), 2024
14 March 2024
Akhil Kedia, Mohd Abbas Zaidi, Sushil Khyalia, Jungho Jung, Harshith Goka, Haejun Lee
Topics: MoE
Links: arXiv (abs) · PDF · HTML · HuggingFace (1 upvote) · GitHub (10★)

Papers citing "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"

4 of 4 papers shown

Normalization in Attention Dynamics
Nikita Karagodin, Shu Ge, Yury Polyanskiy, Philippe Rigollet
24 Oct 2025

Stability of Transformers under Layer Normalization
Kelvin Kan, Xingjian Li, Benjamin J. Zhang, Tuhin Sahai, Stanley Osher, Krishna Kumar, Markos A. Katsoulakis
10 Oct 2025

Short-Range Dependency Effects on Transformer Instability and a Decomposed Attention Solution
Suvadeep Hajra
21 May 2025

Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey, Bin Claire Zhang, Lorenzo Noci, Mufan Li, Blake Bordelon, Shane Bergsma, Cengiz Pehlevan, Boris Hanin, Joel Hestness
2 May 2025