arXiv:2405.05409 (v5, latest)
Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing
Neural Information Processing Systems (NeurIPS), 2024
8 May 2024
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Z. Xu
Papers citing
"Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing"
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Zheng-an Chen
Tao Luo
AI4CE
76
1
0
08 Oct 2025
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
404
2
0
21 Feb 2025
Quantifying artificial intelligence through algorithmic generalization
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Takuya Ito
Murray Campbell
L. Horesh
Tim Klinger
Parikshit Ram
ELM
380
0
0
08 Nov 2024