A Dynamical Model of Neural Scaling Laws
arXiv:2402.01092 (v4, latest)
2 February 2024
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan

Papers citing "A Dynamical Model of Neural Scaling Laws"

45 / 45 papers shown
Data Curation Through the Lens of Spectral Dynamics: Static Limits, Dynamic Acceleration, and Practical Oracles
Yizhou Zhang, Lun Du
02 Dec 2025

Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
Guillaume Braun, Bruno Loureiro, Ha Quang Minh, Masaaki Imaizumi
24 Nov 2025
Axial Neural Networks for Dimension-Free Foundation Models
Hyunsu Kim, Jonggeon Park, Joan Bruna, Hongseok Yang, Juho Lee
15 Oct 2025

Mid-Training of Large Language Models: A Survey
Kaixiang Mo, Yuxin Shi, Weiwei Weng, Zhiqiang Zhou, Shuman Liu, Haibo Zhang, Anxiang Zeng
08 Oct 2025
Kernel ridge regression under power-law data: spectrum and generalization
Arie Wortsman, Bruno Loureiro
06 Oct 2025

Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
Blake Bordelon, Mary I. Letey, Cengiz Pehlevan
01 Oct 2025
Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
Leonardo Defilippis, Yizhou Xu, Julius Girardin, Emanuele Troiani, Vittorio Erba, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala
29 Sep 2025

Evaluating the Robustness of Chinchilla Compute-Optimal Scaling
Rylan Schaeffer, Noam Levi, Andreas Kirsch, Theo Guenais, Brando Miranda, Elyas Obbad, Sanmi Koyejo
28 Sep 2025
Scaling Laws are Redundancy Laws
Yuda Bi, Vince D. Calhoun
25 Sep 2025

Theoretical Foundations of Representation Learning using Unlabeled Data: Statistics and Optimization
Pascal Esser, Maximilian Fleissner, Debarghya Ghoshdastidar
23 Sep 2025
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Lowell Weissman, Michael Krumdick, A. Lynn Abbott
15 Jun 2025

Improved Scaling Laws in Linear Regression via Data Reuse
Licong Lin, Jingfeng Wu, Peter Bartlett
10 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson, Zhichao Wang, Michael W. Mahoney
04 Jun 2025

X-Factor: Quality Is a Dataset-Intrinsic Property
Josiah D. Couch, Miao Li, Rima Arnaout, R. Arnaout
28 May 2025
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
Francesco Cagnetta, Alessandro Favero, Antonio Sclocchi, Matthieu Wyart
11 May 2025

Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta, Hyunmo Kang, Matthieu Wyart
11 May 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
International Conference on Learning Representations (ICLR), 2025
Kairong Luo, Haodong Wen, Shengding Hu, Zhenbo Sun, Zhiyuan Liu, Maosong Sun, Kaifeng Lyu, Wenguang Chen
17 Mar 2025
Uncertainty Quantification From Scaling Laws in Deep Neural Networks
Ibrahim Elsharkawy, Yonatan Kahn, Benjamin Hooberman
07 Mar 2025

L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling
Zhuo Chen, Oriol Mayné i Comas, Zhuotao Jin, Di Luo, Marin Soljacic
06 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen, Xuyang Guo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
03 Mar 2025

Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
28 Feb 2025
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines
Ayan Sengupta, Tanmoy Chakraborty
17 Feb 2025

Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
Blake Bordelon, Cengiz Pehlevan
04 Feb 2025
Physics of Skill Learning
Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark
21 Jan 2025

Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener, Nikhil Anand, Nikhil Vyas, Eran Malach, Sham Kakade
19 Nov 2024
Scaling Laws for Precision
International Conference on Learning Representations (ICLR), 2024
Tanishq Kumar, Zachary Ankner, Benjamin Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, Aditi Raghunathan
07 Nov 2024

How Does Critical Batch Size Scale in Pre-training?
International Conference on Learning Representations (ICLR), 2024
Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Phillips Foster, Sham Kakade
29 Oct 2024
29 Oct 2024
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling LawsInternational Conference on Learning Representations (ICLR), 2024
M. E. Ildiz
Halil Alperen Gozeten
Ege Onur Taga
Marco Mondelli
Samet Oymak
649
15
0
24 Oct 2024
Towards Neural Scaling Laws for Time Series Foundation Models
Towards Neural Scaling Laws for Time Series Foundation ModelsInternational Conference on Learning Representations (ICLR), 2024
Qingren Yao
Chao-Han Huck Yang
Renhe Jiang
Yuxuan Liang
Ming Jin
Shirui Pan
AI4TSAI4CE
451
31
0
16 Oct 2024
Scaling laws for post-training quantized large language models
Zifei Xu, Alexander Lan, W. Yazar, T. Webb, Sayeh Sharify, Xin Eric Wang
15 Oct 2024

Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
International Conference on Learning Representations (ICLR), 2024
Roman Worschech, B. Rosenow
11 Oct 2024
The Optimization Landscape of SGD Across the Feature Learning Strength
International Conference on Learning Representations (ICLR), 2024
Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan
06 Oct 2024

Dynamic neuron approach to deep neural networks: Decoupling neurons for renormalization group analysis
Donghee Lee, Hye-Sung Lee, Jaeok Yi
01 Oct 2024
How Feature Learning Can Improve Neural Scaling Laws
International Conference on Learning Representations (ICLR), 2024
Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan
26 Sep 2024

Unified Neural Network Scaling Laws and Scale-time Equivalence
Akhilan Boopathy, Ila Fiete
09 Sep 2024
Risk and cross validation in ridge regression with correlated samples
Alexander B. Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan
08 Aug 2024

Spring-block theory of feature learning in deep neural networks
Chengzhi Shi, Liming Pan, Ivan Dokmanić
28 Jul 2024
A Generalization Bound for Nearly-Linear Networks
Eugene Golikov
09 Jul 2024

Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian, Mitchell Wortsman, J. Jitsev, Ludwig Schmidt, Y. Carmon
27 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin, Jingfeng Wu, Sham Kakade, Peter L. Bartlett, Jason D. Lee
12 Jun 2024

Towards a theory of how the structure of language is acquired by deep neural networks
Francesco Cagnetta, Matthieu Wyart
28 May 2024
Infinite Limits of Multi-head Transformer Dynamics
Blake Bordelon, Hamza Tahir Chaudhry, Cengiz Pehlevan
24 May 2024

Chinchilla Scaling: A replication attempt
T. Besiroglu, Ege Erdil, Matthew Barnett, Josh You
15 Apr 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
International Conference on Machine Learning (ICML), 2023
Nikhil Sardana, Jacob P. Portes, Sasha Doubov, Jonathan Frankle
31 Dec 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Neural Information Processing Systems (NeurIPS), 2023
Blake Bordelon, Cengiz Pehlevan
06 Apr 2023