A Dynamical Model of Neural Scaling Laws
arXiv:2402.01092 (v4, latest)
2 February 2024
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan

Papers citing "A Dynamical Model of Neural Scaling Laws"

45 / 45 papers shown
Data Curation Through the Lens of Spectral Dynamics: Static Limits, Dynamic Acceleration, and Practical Oracles
Yizhou Zhang, Lun Du
02 Dec 2025

Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
Guillaume Braun, Bruno Loureiro, Ha Quang Minh, Masaaki Imaizumi
24 Nov 2025
Axial Neural Networks for Dimension-Free Foundation Models
Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee
15 Oct 2025

Mid-Training of Large Language Models: A Survey
Kaixiang Mo, Yuxin Shi, Weiwei Weng, Zhiqiang Zhou, Shuman Liu, Haibo Zhang, Anxiang Zeng
08 Oct 2025
Kernel ridge regression under power-law data: spectrum and generalization
Arie Wortsman, Bruno Loureiro
06 Oct 2025

Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
Blake Bordelon, Mary I. Letey, Cengiz Pehlevan
01 Oct 2025
Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
Leonardo Defilippis, Yizhou Xu, Julius Girardin, Emanuele Troiani, Vittorio Erba, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala
29 Sep 2025

Evaluating the Robustness of Chinchilla Compute-Optimal Scaling
Rylan Schaeffer, Noam Levi, Andreas Kirsch, Theo Guenais, Brando Miranda, Elyas Obbad, Sanmi Koyejo
28 Sep 2025
Scaling Laws are Redundancy Laws
Yuda Bi, Vince D. Calhoun
25 Sep 2025

Theoretical Foundations of Representation Learning using Unlabeled Data: Statistics and Optimization
Pascal Esser, Maximilian Fleissner, Debarghya Ghoshdastidar
23 Sep 2025
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Lowell Weissman, Michael Krumdick, A. Lynn Abbott
15 Jun 2025

Improved Scaling Laws in Linear Regression via Data Reuse
Licong Lin, Jingfeng Wu, Peter Bartlett
10 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson, Zhichao Wang, Michael W. Mahoney
04 Jun 2025

X-Factor: Quality Is a Dataset-Intrinsic Property
Josiah D. Couch, Miao Li, Rima Arnaout, R. Arnaout
28 May 2025
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta, Alessandro Favero, Antonio Sclocchi, Matthieu Wyart
11 May 2025

Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta, Hyunmo Kang, Matthieu Wyart
11 May 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
International Conference on Learning Representations (ICLR), 2025
Kairong Luo, Haodong Wen, Shengding Hu, Zhenbo Sun, Zhiyuan Liu, Maosong Sun, Kaifeng Lyu, Wenguang Chen
17 Mar 2025
Uncertainty Quantification From Scaling Laws in Deep Neural Networks
Ibrahim Elsharkawy, Yonatan Kahn, Benjamin Hooberman
07 Mar 2025

L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen, Oriol Mayné i Comas, Zhuotao Jin, Di Luo, Marin Soljacic
06 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen, Xuyang Guo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
03 Mar 2025

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
28 Feb 2025
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines
Ayan Sengupta, Tanmoy Chakraborty
17 Feb 2025

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, Cengiz Pehlevan
04 Feb 2025
Physics of Skill Learning
Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark
21 Jan 2025

Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham Kakade
19 Nov 2024
Scaling Laws for Precision
International Conference on Learning Representations (ICLR), 2024
Tanishq Kumar, Zachary Ankner, Benjamin Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, Aditi Raghunathan
07 Nov 2024

How Does Critical Batch Size Scale in Pre-training?
International Conference on Learning Representations (ICLR), 2024
Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Phillips Foster, Sham Kakade
29 Oct 2024
29 Oct 2024
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling LawsInternational Conference on Learning Representations (ICLR), 2024
M. E. Ildiz
Halil Alperen Gozeten
Ege Onur Taga
Marco Mondelli
Samet Oymak
649
15
0
24 Oct 2024
Towards Neural Scaling Laws for Time Series Foundation Models
Towards Neural Scaling Laws for Time Series Foundation ModelsInternational Conference on Learning Representations (ICLR), 2024
Qingren Yao
Chao-Han Huck Yang
Renhe Jiang
Yuxuan Liang
Ming Jin
Shirui Pan
AI4TSAI4CE
451
31
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu, Alexander Lan, W. Yazar, T. Webb, Sayeh Sharify, Xin Eric Wang
15 Oct 2024

Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
International Conference on Learning Representations (ICLR), 2024
Roman Worschech, B. Rosenow
11 Oct 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
International Conference on Learning Representations (ICLR), 2024
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan
06 Oct 2024

Dynamic neuron approach to deep neural networks: Decoupling neurons for renormalization group analysis
Donghee Lee, Hye-Sung Lee, Jaeok Yi
01 Oct 2024
How Feature Learning Can Improve Neural Scaling Laws
International Conference on Learning Representations (ICLR), 2024
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan
26 Sep 2024

Unified Neural Network Scaling Laws and Scale-time Equivalence
Akhilan Boopathy, Ila Fiete
09 Sep 2024
Risk and cross validation in ridge regression with correlated samples
Alexander B. Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan
08 Aug 2024

Spring-block theory of feature learning in deep neural networks
Chengzhi Shi, Liming Pan, Ivan Dokmanić
28 Jul 2024
A Generalization Bound for Nearly-Linear Networks
Eugene Golikov
09 Jul 2024

Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian, Mitchell Wortsman, J. Jitsev, Ludwig Schmidt, Y. Carmon
27 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin, Jingfeng Wu, Sham Kakade, Peter L. Bartlett, Jason D. Lee
12 Jun 2024

Towards a theory of how the structure of language is acquired by deep neural networks
Francesco Cagnetta, Matthieu Wyart
28 May 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon, Hamza Tahir Chaudhry, Cengiz Pehlevan
24 May 2024

Chinchilla Scaling: A replication attempt
T. Besiroglu, Ege Erdil, Matthew Barnett, Josh You
15 Apr 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
International Conference on Machine Learning (ICML), 2023
Nikhil Sardana, Jacob P. Portes, Sasha Doubov, Jonathan Frankle
31 Dec 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Blake Bordelon, Cengiz Pehlevan
06 Apr 2023