SpanNorm: Reconciling Training Stability and Performance in Deep Transformers
Chao Wang
Bei Li
Jiaqi Zhang
Xinyu Liu
Yuchun Fan
Linkun Lyu
Xin Chen
Jingang Wang
Tong Xiao
Peng Pei
Xunliang Cai
