Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing

Neural Information Processing Systems (NeurIPS), 2024
8 May 2024
Zhongwang Zhang
Pengxiao Lin
Zhiwei Wang
Yaoyu Zhang
Z. Xu

Papers citing "Initialization is Critical to Whether Transformers Fit Composite Functions by Reasoning or Memorizing"

3 / 3 papers shown
From Condensation to Rank Collapse: A Two-Stage Analysis of Transformer Training Dynamics
Zheng-an Chen
Tao Luo
AI4CE
08 Oct 2025
Reasoning Bias of Next Token Prediction Training
Pengxiao Lin
Zhongwang Zhang
Zhi-Qin John Xu
LRM
21 Feb 2025
Quantifying artificial intelligence through algorithmic generalization
Nature Machine Intelligence (Nat. Mach. Intell.), 2024
Takuya Ito
Murray Campbell
L. Horesh
Tim Klinger
Parikshit Ram
ELM
08 Nov 2024