An analytic theory of generalization dynamics and transfer learning in deep linear networks

arXiv:1809.10374 · 27 September 2018
Andrew Kyle Lampinen, Surya Ganguli · OOD

Papers citing "An analytic theory of generalization dynamics and transfer learning in deep linear networks"

30 of 30 papers shown. Each entry lists the title, then authors · topic tags (where given) · publication date · the page's per-paper counters.

Make Haste Slowly: A Theory of Emergent Structured Mixed Selectivity in Feature Learning ReLU Networks
    Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M. Saxe · MLT · 08 Mar 2025 · 64 / 1 / 0

From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
    Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe · 22 Sep 2024 · 32 / 3 / 0

Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
    Naoki Hiratani · CLL · 30 May 2024 · 35 / 2 / 0

Learned feature representations are biased by complexity, learning order, position, and more
    Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann · AI4CE, FaML, SSL, OOD · 09 May 2024 · 34 / 6 / 0

Reconciling Shared versus Context-Specific Information in a Neural Network Model of Latent Causes
    Qihong Lu, Tan Nguyen, Qiong Zhang, Uri Hasson, Thomas L. Griffiths, Jeffrey M. Zacks, Samuel Gershman, K. A. Norman · 13 Dec 2023 · 30 / 4 / 0

The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
    Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe · OffRL · 17 Jun 2023 · 28 / 3 / 0

On a continuous time model of gradient descent dynamics and instability in deep learning
    Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin · 03 Feb 2023 · 16 / 6 / 0

Globally Gated Deep Linear Networks
    Qianyi Li, H. Sompolinsky · AI4CE · 31 Oct 2022 · 14 / 10 / 0

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
    Andrew M. Saxe, Shagun Sodhani, Sam Lewallen · AI4CE · 21 Jul 2022 · 28 / 34 / 0

Neural Collapse: A Review on Modelling Principles and Generalization
    Vignesh Kothapalli · 08 Jun 2022 · 21 / 71 / 0

Explaining the physics of transfer learning a data-driven subgrid-scale closure to a different turbulent flow
    Adam Subel, Yifei Guan, A. Chattopadhyay, P. Hassanzadeh · AI4CE · 07 Jun 2022 · 27 / 41 / 0

Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
    Noam Razin, Asaf Maman, Nadav Cohen · 27 Jan 2022 · 40 / 29 / 0

Overview frequency principle/spectral bias in deep learning
    Z. Xu, Yaoyu Zhang, Tao Luo · FaML · 19 Jan 2022 · 27 / 65 / 0

Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
    Jiayao Zhang, Hua Wang, Weijie J. Su · 11 Oct 2021 · 32 / 7 / 0

Towards Demystifying Representation Learning with Non-contrastive Self-supervision
    Xiang Wang, Xinlei Chen, S. Du, Yuandong Tian · SSL · 11 Oct 2021 · 18 / 26 / 0

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
    Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk · 06 Sep 2021 · 29 / 71 / 0

A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
    Gadi Naveh, Z. Ringel · SSL, MLT · 08 Jun 2021 · 23 / 31 / 0

Understanding self-supervised Learning Dynamics without Contrastive Pairs
    Yuandong Tian, Xinlei Chen, Surya Ganguli · SSL · 12 Feb 2021 · 138 / 279 / 0

Phase Transitions in Transfer Learning for High-Dimensional Perceptrons
    Oussama Dhifallah, Yue M. Lu · 06 Jan 2021 · 32 / 20 / 0

Gradient Starvation: A Learning Proclivity in Neural Networks
    Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie · MLT · 18 Nov 2020 · 45 / 257 / 0

Chaos and Complexity from Quantum Neural Network: A study with Diffusion Metric in Machine Learning
    S. Choudhury, Ankan Dutta, Debisree Ray · 16 Nov 2020 · 22 / 21 / 0

Understanding Self-supervised Learning with Dual Deep Networks
    Yuandong Tian, Lantao Yu, Xinlei Chen, Surya Ganguli · SSL · 01 Oct 2020 · 13 / 78 / 0

Learning to Play against Any Mixture of Opponents
    Max O. Smith, Thomas W. Anthony, Yongzhao Wang, Michael P. Wellman · OffRL · 29 Sep 2020 · 22 / 9 / 0

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
    Dmitry Lepikhin, HyoukJoong Lee, Yuanzhong Xu, Dehao Chen, Orhan Firat, Yanping Huang, M. Krikun, Noam M. Shazeer, Z. Chen · MoE · 30 Jun 2020 · 20 / 1,106 / 0

What shapes feature representations? Exploring datasets, architectures, and training
    Katherine L. Hermann, Andrew Kyle Lampinen · OOD · 22 Jun 2020 · 23 / 153 / 0

An analytic theory of shallow networks dynamics for hinge loss classification
    Franco Pellegrini, Giulio Biroli · 19 Jun 2020 · 24 / 19 / 0

Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
    Yehuda Dar, Richard G. Baraniuk · 12 Jun 2020 · 23 / 19 / 0

Implicit Regularization in Deep Learning May Not Be Explainable by Norms
    Noam Razin, Nadav Cohen · 13 May 2020 · 21 / 155 / 0

Implicit Regularization in Deep Matrix Factorization
    Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo · AI4CE · 31 May 2019 · 24 / 491 / 0

Norm-Based Capacity Control in Neural Networks
    Behnam Neyshabur, Ryota Tomioka, Nathan Srebro · 27 Feb 2015 · 119 / 577 / 0