Deep Double Descent: Where Bigger Models and More Data Hurt

4 December 2019

Papers citing "Deep Double Descent: Where Bigger Models and More Data Hurt"

50 / 182 papers shown

Title
Unifying Grokking and Double Descent Peter W. Battaglia David Raposo Kelsey 34 31 0 10 Mar 2023
Tradeoff of generalization error in unsupervised learning Gilhan Kim Ho-Jun Lee Junghyo Jo Yongjoo Baek 13 0 0 10 Mar 2023
DSD$^2$: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free? Victor Quétu Enzo Tartaglione 26 7 0 02 Mar 2023
Can we avoid Double Descent in Deep Neural Networks? Victor Quétu Enzo Tartaglione AI4CE 20 3 0 26 Feb 2023
Generalization Bounds with Data-dependent Fractal Dimensions Benjamin Dupuis George Deligiannidis Umut Şimşekli AI4CE 35 12 0 06 Feb 2023
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels Simone Bombari Shayan Kiyani Marco Mondelli AAML 28 10 0 03 Feb 2023
Pathologies of Predictive Diversity in Deep Ensembles Taiga Abe E. Kelly Buchanan Geoff Pleiss John P. Cunningham UQCV 38 13 0 01 Feb 2023
On the Lipschitz Constant of Deep Networks and Double Descent Matteo Gamba Hossein Azizpour Marten Bjorkman 25 7 0 28 Jan 2023
A Simple Algorithm For Scaling Up Kernel Methods Tengyu Xu Bryan T. Kelly Semyon Malamud 11 0 0 26 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model Z. Fu Anthony Man-Cho So Nigel Collier 23 3 0 24 Jan 2023
Strong inductive biases provably prevent harmless interpolation Michael Aerni Marco Milanta Konstantin Donhauser Fanny Yang 30 9 0 18 Jan 2023
WLD-Reg: A Data-dependent Within-layer Diversity Regularizer Firas Laakom Jenni Raitoharju Alexandros Iosifidis M. Gabbouj AI4CE 26 7 0 03 Jan 2023
Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification Yuxuan Du Yibo Yang Dacheng Tao Min-hsiu Hsieh 36 22 0 29 Dec 2022
Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures Antoine Bodin N. Macris 34 4 0 13 Dec 2022
Leveraging Unlabeled Data to Track Memorization Mahsa Forouzesh Hanie Sedghi Patrick Thiran NoLa TDI 34 3 0 08 Dec 2022
Task Discovery: Finding the Tasks that Neural Networks Generalize on Andrei Atanov Andrei Filatov Teresa Yeo Ajay Sohmshetty Amir Zamir OOD 40 10 0 01 Dec 2022
A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance Marco Loog T. Viering 21 1 0 25 Nov 2022
PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization Sanae Lotfi Marc Finzi Sanyam Kapoor Andres Potapczynski Micah Goldblum A. Wilson BDL MLT AI4CE 29 51 0 24 Nov 2022
Novel transfer learning schemes based on Siamese networks and synthetic data Dominik Stallmann Philip Kenneweg Barbara Hammer 18 6 0 21 Nov 2022
Understanding the double descent curve in Machine Learning Luis Sa-Couto J. M. Ramos Miguel Almeida Andreas Wichert 27 1 0 18 Nov 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features Lawrence Stewart Francis R. Bach Quentin Berthet Jean-Philippe Vert 27 24 0 10 Nov 2022
A Solvable Model of Neural Scaling Laws A. Maloney Daniel A. Roberts J. Sully 36 51 0 30 Oct 2022
Broken Neural Scaling Laws Ethan Caballero Kshitij Gupta Irina Rish David M. Krueger 19 74 0 26 Oct 2022
Grokking phase transitions in learning local rules with gradient descent Bojan Žunkovič E. Ilievski 63 16 0 26 Oct 2022
SGD with Large Step Sizes Learns Sparse Features Maksym Andriushchenko Aditya Varre Loucas Pillaud-Vivien Nicolas Flammarion 45 56 0 11 Oct 2022
The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels Daniel Shwartz Uri Stern D. Weinshall NoLa 33 2 0 02 Oct 2022
The Minority Matters: A Diversity-Promoting Collaborative Metric Learning Algorithm Shilong Bao Qianqian Xu Zhiyong Yang Yuan He Xiaochun Cao Qingming Huang 30 8 0 30 Sep 2022
On the Impossible Safety of Large AI Models El-Mahdi El-Mhamdi Sadegh Farhadkhani R. Guerraoui Nirupam Gupta L. Hoang Rafael Pinot Sébastien Rouault John Stephan 30 31 0 30 Sep 2022
Bayesian Neural Network Versus Ex-Post Calibration For Prediction Uncertainty Satya Borgohain Klaus Ackermann Rubén Loaiza-Maya BDL UQCV 13 0 0 29 Sep 2022
Neural parameter calibration for large-scale multi-agent models Thomas Gaskin G. Pavliotis Mark Girolami AI4TS 23 23 0 27 Sep 2022
In-context Learning and Induction Heads Catherine Olsson Nelson Elhage Neel Nanda Nicholas Joseph Nova DasSarma ... Tom B. Brown Jack Clark Jared Kaplan Sam McCandlish C. Olah 250 460 0 24 Sep 2022
Deep Double Descent via Smooth Interpolation Matteo Gamba Erik Englesson Marten Bjorkman Hossein Azizpour 63 10 0 21 Sep 2022
Importance Tempering: Group Robustness for Overparameterized Models Yiping Lu Wenlong Ji Zachary Izzo Lexing Ying 39 7 0 19 Sep 2022
Information FOMO: The unhealthy fear of missing out on information. A method for removing misleading data for healthier models Ethan Pickering T. Sapsis 21 6 0 27 Aug 2022
Learning Hyper Label Model for Programmatic Weak Supervision Renzhi Wu Sheng Chen Jieyu Zhang Xu Chu 23 16 0 27 Jul 2022
The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks Charles Edison Tripp J. Perr-Sauer L. Hayne M. Lunacek Jamil Gafur AI4CE 21 0 0 25 Jul 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting Zhengqi He Zeke Xie Quanzhi Zhu Zengchang Qin 69 27 0 17 Jun 2022
Learning Uncertainty with Artificial Neural Networks for Improved Predictive Process Monitoring Hans Weytjens Jochen De Weerdt 19 17 0 13 Jun 2022
Towards Understanding Sharpness-Aware Minimization Maksym Andriushchenko Nicolas Flammarion AAML 26 133 0 13 Jun 2022
Overcoming the Spectral Bias of Neural Value Approximation Ge Yang Anurag Ajay Pulkit Agrawal 32 25 0 09 Jun 2022
Regularization-wise double descent: Why it occurs and how to eliminate it Fatih Yilmaz Reinhard Heckel 25 11 0 03 Jun 2022
A Blessing of Dimensionality in Membership Inference through Regularization Jasper Tan Daniel LeJeune Blake Mason Hamid Javadi Richard G. Baraniuk 32 18 0 27 May 2022
Physics-Embedded Neural Networks: Graph Neural PDE Solvers with Mixed Boundary Conditions Masanobu Horie Naoto Mitsume PINN AI4CE 26 23 0 24 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models Kushal Tirumala Aram H. Markosyan Luke Zettlemoyer Armen Aghajanyan TDI 29 185 0 22 May 2022
Sharp Asymptotics of Kernel Ridge Regression Beyond the Linear Regime Hong Hu Yue M. Lu 51 15 0 13 May 2022
Overparameterization Improves StyleGAN Inversion Yohan Poirier-Ginter Alexandre Lessard Ryan Smith Jean-François Lalonde 40 4 0 12 May 2022
Investigating Generalization by Controlling Normalized Margin Alexander R. Farhang Jeremy Bernstein Kushal Tirumala Yang Liu Yisong Yue 28 6 0 08 May 2022
Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning Mathias Lechner Alexander Amini Daniela Rus T. Henzinger AAML 26 9 0 15 Apr 2022
Machine Learning and Deep Learning -- A review for Ecologists Maximilian Pichler F. Hartig 42 127 0 11 Apr 2022
Discovering and forecasting extreme events via active learning in neural operators Ethan Pickering Stephen Guth George Karniadakis T. Sapsis AI4CE 11 57 0 05 Apr 2022