1912.02292
Cited By
Deep Double Descent: Where Bigger Models and More Data Hurt
4 December 2019
Preetum Nakkiran
Gal Kaplun
Yamini Bansal
Tristan Yang
Boaz Barak
Ilya Sutskever
Papers citing
"Deep Double Descent: Where Bigger Models and More Data Hurt"
50 / 182 papers shown
Title
A dynamic view of the double descent
Vivek Shripad Borkar
63
0
0
03 May 2025
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
Roman Abramov
Felix Steinbauer
Gjergji Kasneci
144
0
0
29 Apr 2025
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Yiping Wang
Qing Yang
Zhiyuan Zeng
Liliang Ren
L. Liu
...
Jianfeng Gao
Weizhu Chen
S. Wang
Simon S. Du
Yelong Shen
OffRL
ReLM
LRM
118
4
0
29 Apr 2025
The Double Descent Behavior in Two Layer Neural Network for Binary Classification
Chathurika S Abeykoon
A. Beknazaryan
Hailin Sang
51
1
0
27 Apr 2025
A Model Zoo on Phase Transitions in Neural Networks
Konstantin Schürholt
Léo Meynent
Yefan Zhou
Haiquan Lu
Yaoqing Yang
Damian Borth
68
0
0
25 Apr 2025
PETNet -- Coincident Particle Event Detection using Spiking Neural Networks
Jan Debus
Charlotte Debus
Günther Dissertori
Markus Götz
31
0
0
09 Apr 2025
The Challenge of Achieving Attributability in Multilingual Table-to-Text Generation with Question-Answer Blueprints
Aden Haussmann
LMTD
57
0
0
29 Mar 2025
On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
Shun Iwase
Shuya Takahashi
Nakamasa Inoue
Rio Yokota
Ryo Nakamura
Hirokatsu Kataoka
74
0
0
04 Mar 2025
From Small to Large Language Models: Revisiting the Federalist Papers
So Won Jeong
Veronika Rockova
37
0
0
25 Feb 2025
On Memorization in Diffusion Models
Xiangming Gu
Chao Du
Tianyu Pang
Chongxuan Li
Min-Bin Lin
Ye Wang
DiffM
TDI
166
43
0
21 Feb 2025
Early Stopping Against Label Noise Without Validation Data
Suqin Yuan
Lei Feng
Tongliang Liu
NoLa
98
15
0
11 Feb 2025
Analysis of Overparameterization in Continual Learning under a Linear Model
Daniel Goldfarb
Paul Hand
CLL
39
0
0
11 Feb 2025
The Cake that is Intelligence and Who Gets to Bake it: An AI Analogy and its Implications for Participation
Martin Mundt
Anaelia Ovalle
Felix Friedrich
A Pranav
Subarnaduti Paul
Manuel Brack
Kristian Kersting
William Agnew
281
0
0
05 Feb 2025
How more data can hurt: Instability and regularization in next-generation reservoir computing
Yuanzhao Zhang
Edmilson Roque dos Santos
Sean P. Cornelius
77
2
0
28 Jan 2025
Functional Risk Minimization
Ferran Alet
Clement Gehring
Tomás Lozano-Pérez
Kenji Kawaguchi
Joshua B. Tenenbaum
Leslie Pack Kaelbling
OffRL
60
0
0
31 Dec 2024
Understanding Model Ensemble in Transferable Adversarial Attack
Wei Yao
Zeliang Zhang
Huayi Tang
Yong Liu
33
2
0
09 Oct 2024
U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models
Tung-Yu Wu
Pei-Yu Lo
ReLM
LRM
46
2
0
02 Oct 2024
Investigating the Impact of Model Complexity in Large Language Models
Jing Luo
Huiyuan Wang
Weiran Huang
34
0
0
01 Oct 2024
Zero-shot forecasting of chaotic systems
Yuanzhao Zhang
William Gilpin
AI4TS
37
4
0
24 Sep 2024
Improved Diversity-Promoting Collaborative Metric Learning for Recommendation
Shilong Bao
Qianqian Xu
Zhiyong Yang
Yuan He
Xiaochun Cao
Qingming Huang
45
5
0
02 Sep 2024
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning
Mohammadamin Banayeeanzade
Mahdi Soltanolkotabi
Mohammad Rostami
CLL
LRM
103
1
0
29 Aug 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Daking Rai
Yilun Zhou
Shi Feng
Abulhair Saparov
Ziyu Yao
82
19
0
02 Jul 2024
Establishing Deep InfoMax as an effective self-supervised learning methodology in materials informatics
Michael Moran
Vladimir V. Gusev
M. Gaultois
Dmytro Antypov
M. Rosseinsky
AI4CE
25
0
0
30 Jun 2024
Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models
Ziche Liu
Rui Ke
Feng Jiang
Feng Jiang
Haizhou Li
69
1
0
20 Jun 2024
Just How Flexible are Neural Networks in Practice?
Ravid Shwartz-Ziv
Micah Goldblum
Arpit Bansal
C. B. Bruss
Yann LeCun
Andrew Gordon Wilson
40
4
0
17 Jun 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement
Wangyou Zhang
Kohei Saijo
Jee-weon Jung
Chenda Li
Shinji Watanabe
Yanmin Qian
32
4
0
06 Jun 2024
Representations as Language: An Information-Theoretic Framework for Interpretability
Henry Conklin
Kenny Smith
MILM
39
1
0
04 Jun 2024
A Margin-based Multiclass Generalization Bound via Geometric Complexity
Michael Munn
Benoit Dherin
Javier Gonzalvo
UQCV
40
2
0
28 May 2024
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding
Zifan Carl Guo
Eric J. Michaud
Ziming Liu
Max Tegmark
48
3
0
27 May 2024
Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory
Xueyan Niu
Bo Bai
Lei Deng
Wei Han
31
6
0
14 May 2024
pFedLVM: A Large Vision Model (LVM)-Driven and Latent Feature-Based Personalized Federated Learning Framework in Autonomous Driving
Wei-Bin Kou
Qingfeng Lin
Ming Tang
Sheng Xu
Rongguang Ye
...
Shuai Wang
Guofa Li
Zhenyu Chen
Guangxu Zhu
Yik-Chung Wu
FedML
52
11
0
07 May 2024
Why is SAM Robust to Label Noise?
Christina Baek
Zico Kolter
Aditi Raghunathan
NoLa
AAML
41
9
0
06 May 2024
LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing
Zeyang Ma
A. Chen
Dong Jae Kim
Tse-Hsun Chen
Shaowei Wang
27
44
0
27 Apr 2024
Predictive Churn with the Set of Good Models
J. Watson-Daniels
Flavio du Pin Calmon
Alexander D'Amour
Carol Xuan Long
David C. Parkes
Berk Ustun
83
7
0
12 Feb 2024
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker
Frederick Altrock
Benjamin Risse
76
5
0
22 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
14
0
0
08 Jan 2024
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
48
8
0
07 Sep 2023
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
T. Kajitsuka
Issei Sato
31
16
0
26 Jul 2023
What, Indeed, is an Achievable Provable Guarantee for Learning-Enabled Safety Critical Systems
Saddek Bensalem
Chih-Hong Cheng
Wei Huang
Xiaowei Huang
Changshun Wu
Xingyu Zhao
AAML
24
6
0
20 Jul 2023
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
V. Arora
Daniele Irto
Sebastian Goldt
G. Sanguinetti
36
2
0
21 Jun 2023
Gibbs-Based Information Criteria and the Over-Parameterized Regime
Haobo Chen
Yuheng Bu
Greg Wornell
27
1
0
08 Jun 2023
Double Descent of Discrepancy: A Task-, Data-, and Model-Agnostic Phenomenon
Yi-Xiao Luo
Bin Dong
26
0
0
25 May 2023
How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features
Simone Bombari
Marco Mondelli
AAML
19
4
0
20 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li
Zixiong Yu
Y. Cotronis
Qian Lin
55
13
0
04 May 2023
Diversifying the High-level Features for better Adversarial Transferability
Zhiyuan Wang
Zeliang Zhang
Siyuan Liang
Xiaosen Wang
AAML
42
18
0
20 Apr 2023
Mathematical Challenges in Deep Learning
V. Nia
Guojun Zhang
I. Kobyzev
Michael R. Metel
Xinlin Li
...
S. Hemati
M. Asgharian
Linglong Kong
Wulong Liu
Boxing Chen
AI4CE
VLM
37
1
0
24 Mar 2023
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset
Thanh-Dung Le
P. Jouvet
R. Noumeir
MoE
MedIm
72
5
0
22 Mar 2023
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa
Shreyas Saxena
Abhay Gupta
Sean Lie
28
3
0
21 Mar 2023
Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu
30
4
0
20 Mar 2023
Deep Learning Weight Pruning with RMT-SVD: Increasing Accuracy and Reducing Overfitting
Yitzchak Shmalo
Jonathan Jenkins
Oleksii Krupchytskyi
22
3
0
15 Mar 2023