On Lazy Training in Differentiable Programming (arXiv:1812.07956)

Lénaïc Chizat, Edouard Oyallon, Francis R. Bach · 19 December 2018

Papers citing "On Lazy Training in Differentiable Programming"

Showing 50 of 227 citing papers:
How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features
Simone Bombari, Marco Mondelli · AAML · 20 May 2023

Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani, Alexandru Damian, Jason D. Lee · MLT · 11 May 2023

Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions
Alberto Bordino, Stefano Favaro, S. Fortini · 08 Apr 2023

Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
Blake Bordelon, Cengiz Pehlevan · MLT · 06 Apr 2023

Wide neural networks: From non-gaussian random fields at initialization to the NTK geometry of training
Luís Carvalho, João L. Costa, José Mourão, Gonçalo Oliveira · AI4CE · 06 Apr 2023

Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme, Nicolas Flammarion · 02 Apr 2023

Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels
Xuchen You, Shouvanik Chakrabarti, Boyang Chen, Xiaodi Wu · 26 Mar 2023

Online Learning for the Random Feature Model in the Student-Teacher Framework
Roman Worschech, B. Rosenow · 24 Mar 2023

Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen, Yuqing Li, Tao Luo, Zhaoguang Zhou, Z. Xu · MLT, AI4CE · 12 Mar 2023

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu, Tianlong Chen, Zhenyu (Allen) Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang · 03 Mar 2023

The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting
Hongyao Tang, Hao Fei, Jianye Hao · 02 Mar 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du · 20 Feb 2023

Dataset Distillation with Convexified Implicit Gradients
Noel Loo, Ramin Hasani, Mathias Lechner, Daniela Rus · DD · 13 Feb 2023

How to prepare your task head for finetuning
Yi Ren, Shangmin Guo, Wonho Bae, Danica J. Sutherland · 11 Feb 2023

Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
Simone Bombari, Shayan Kiyani, Marco Mondelli · AAML · 03 Feb 2023

Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation
Noel Loo, Ramin Hasani, Mathias Lechner, Alexander Amini, Daniela Rus · DD · 02 Feb 2023

Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron, Fadhel Ayed, Paul Jung, Hoileong Lee, Juho Lee, Hongseok Yang · 02 Feb 2023

Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning
Antonio Sclocchi, Mario Geiger, M. Wyart · 31 Jan 2023

A Simple Algorithm For Scaling Up Kernel Methods
Tengyu Xu, Bryan Kelly, Semyon Malamud · 26 Jan 2023

ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
Guihong Li, Yuedong Yang, Kartikeya Bhardwaj, R. Marculescu · 26 Jan 2023

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang · 30 Dec 2022

The Quantum Path Kernel: a Generalized Quantum Neural Tangent Kernel for Deep Quantum Machine Learning
Massimiliano Incudini, Michele Grossi, Antonio Mandarino, S. Vallecorsa, Alessandra Di Pierro, David Windridge · 22 Dec 2022

Learning threshold neurons via the "edge of stability"
Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Y. Lee, Felipe Suarez, Yi Zhang · MLT · 14 Dec 2022

Selective Amnesia: On Efficient, High-Fidelity and Blind Suppression of Backdoor Effects in Trojaned Machine Learning Models
Rui Zhu, Di Tang, Siyuan Tang, Xiaofeng Wang, Haixu Tang · AAML, FedML · 09 Dec 2022

Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Kangyu Weng, Aohua Cheng, Ziyang Zhang, Pei Sun, Yang Tian · 04 Dec 2022

Infinite-width limit of deep linear neural networks
Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli · 29 Nov 2022

Why Neural Networks Work
Sayan Mukherjee, Bernardo A. Huberman · 26 Nov 2022

Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models
Mark Rofin, Nikita Balagansky, Daniil Gavrilov · MoMe, KELM · 22 Nov 2022

Do highly over-parameterized neural networks generalize since bad solutions are rare?
Julius Martinetz, T. Martinetz · 07 Nov 2022

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna · MLT · 28 Oct 2022

Learning Single-Index Models with Shallow Neural Networks
A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 27 Oct 2022

Evolution of Neural Tangent Kernels under Benign and Adversarial Training
Noel Loo, Ramin Hasani, Alexander Amini, Daniela Rus · AAML · 21 Oct 2022

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo · 21 Oct 2022

Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani, Anirbit Mukherjee · 20 Oct 2022

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis, Julia Kempe · AAML · 11 Oct 2022

SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion · 11 Oct 2022

Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida · 10 Oct 2022

Continual task learning in natural and artificial agents
Timo Flesch, Andrew M. Saxe, Christopher Summerfield · CLL · 10 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey · ODL · 10 Oct 2022

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu · MLT · 29 Sep 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye · MLT · 27 Sep 2022

Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin · 19 Sep 2022

Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile, G. Welper · ODL · 17 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · 15 Sep 2022

Differentiable Programming for Earth System Modeling
Maximilian Gelbrecht, Alistair J R White, S. Bathiany, Niklas Boers · 29 Aug 2022

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li · 26 Jul 2022

The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Andrew M. Saxe, Shagun Sodhani, Sam Lewallen · AI4CE · 21 Jul 2022

Graph Neural Network Bandits
Parnian Kassraie, Andreas Krause, Ilija Bogunovic · 13 Jul 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora · 08 Jul 2022

Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 30 Jun 2022