Implicit Regularization in Deep Learning
6 September 2017
Behnam Neyshabur

Papers citing "Implicit Regularization in Deep Learning"

50 / 108 papers shown
Stabilizing Policy Gradient Methods via Reward Profiling
Shihab Ahmed
El Houcine Bergou
A. Dutta
Yue Wang
184
0
0
20 Nov 2025
Deep Learning Inductive Biases for fMRI Time Series Classification during Resting-state and Movie-watching
Behdad Khodabandehloo
Reza Rajimehr
62
0
0
21 Sep 2025
Reason to Rote: Rethinking Memorization in Reasoning
Yupei Du
Philipp Mondorf
Silvia Casola
Yuekun Yao
Robert Litschko
Barbara Plank
168
0
0
07 Jul 2025
Variational Adaptive Noise and Dropout towards Stable Recurrent Neural Networks
Taisuke Kobayashi
Shingo Murata
149
0
0
02 Jun 2025
Identifying Key Challenges of Hardness-Based Resampling
Pawel Pukowski
Venet Osmani
257
0
0
09 Apr 2025
High-entropy Advantage in Neural Networks' Generalizability
Entao Yang
Wei Wei
Yue Shang
Ge Zhang
AI4CE
353
2
0
17 Mar 2025
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Nir Ailon
Akhiad Bercovich
Yahel Uffenheimer
Omri Weinstein
409
3
0
15 Mar 2025
Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture
Yikun Hou
Suvrit Sra
A. Yurtsever
331
0
0
27 Jan 2025
ExpTest: Automating Learning Rate Searching and Tuning with Insights from Linearized Neural Networks
Zan Chaudhry
Naoko Mizuno
283
0
0
25 Nov 2024
LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics
International Conference on Learning Representations (ICLR), 2024
Thomas Robert
M. Safaryan
Ionut-Vlad Modoranu
Dan Alistarh
ODL
406
20
0
21 Oct 2024
A Theoretical Survey on Foundation Models
Shi Fu
Yuzhu Chen
Yingjie Wang
Dacheng Tao
263
0
0
15 Oct 2024
Low-Dimension-to-High-Dimension Generalization And Its Implications for Length Generalization
Yang Chen
Long Yang
Yitao Liang
Zhouchen Lin
318
2
0
11 Oct 2024
Input Space Mode Connectivity in Deep Neural Networks
International Conference on Learning Representations (ICLR), 2024
Jakub Vrabel
Ori Shem-Ur
Yaron Oz
David Krueger
318
1
0
09 Sep 2024
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
International Conference on Machine Learning (ICML), 2024
Qinshuo Liu
Zixin Wang
Xi-An Li
Xinyao Ji
Lei Zhang
Lin Liu
Zhonghua Liu
260
0
0
04 Aug 2024
A Margin-based Multiclass Generalization Bound via Geometric Complexity
Michael Munn
Benoit Dherin
Javier Gonzalvo
UQCV
210
2
0
28 May 2024
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
Michael Munn
Benoit Dherin
Javier Gonzalvo
AAML
243
5
0
24 May 2024
Improving Generalization of Deep Neural Networks by Optimum Shifting
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yuyan Zhou
Ye Li
Lei Feng
Sheng-Jun Huang
OOD
ODL
150
0
0
23 May 2024
A General Theory for Compositional Generalization
Jingwen Fu
Zhizheng Zhang
Yan Lu
Nanning Zheng
AI4CE
CoGe
210
2
0
20 May 2024
Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Engineering applications of artificial intelligence (EAAI), 2024
Guoping Xu
Xiaxia Wang
Xinglong Wu
Xuesong Leng
Yongchao Xu
3DPC
219
12
0
02 May 2024
A Gauss-Newton Approach for Min-Max Optimization in Generative Adversarial Networks
IEEE International Joint Conference on Neural Network (IJCNN), 2024
Neel Mishra
Bamdev Mishra
Pratik Jawanpuria
Pawan Kumar
GAN
181
2
0
10 Apr 2024
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Tanishq Kumar
Kevin Luo
Mark Sellke
228
8
0
02 Feb 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Annual Review of Statistics and Its Application (ARSIA), 2024
Namjoon Suh
Guang Cheng
MedIm
305
17
0
14 Jan 2024
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman
Andrew Kyle Lampinen
Lucas Dixon
Danqi Chen
Asma Ghandeharioun
305
19
0
06 Dec 2023
Critical Influence of Overparameterization on Sharpness-aware Minimization
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
740
2
0
29 Nov 2023
A PAC-Bayesian Perspective on the Interpolating Information Criterion
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
267
2
0
13 Nov 2023
Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics
Soo Min Kwon
Zekai Zhang
Dogyoon Song
Laura Balzano
Qing Qu
265
4
0
08 Nov 2023
PRIOR: Personalized Prior for Reactivating the Information Overlooked in Federated Learning
Mingjia Shi
Yuhao Zhou
Xiaojiang Peng
Huaizheng Zhang
Shudong Huang
Qing Ye
Jiangcheng Lv
240
15
0
13 Oct 2023
A path-norm toolkit for modern networks: consequences, promises and challenges
International Conference on Learning Representations (ICLR), 2023
Antoine Gonon
Nicolas Brisebarre
E. Riccietti
Rémi Gribonval
440
10
0
02 Oct 2023
Asynchronous Graph Generator
Signal Processing (Signal Process.), 2023
Christopher P. Ley
Felipe Tobar
AI4TS
327
0
0
29 Sep 2023
Unveiling Invariances via Neural Network Pruning
Derek Xu
Luke Huan
Wei Wang
208
0
0
15 Sep 2023
The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson
Christopher van der Heide
Roberto Salomone
Fred Roosta
Michael W. Mahoney
202
10
0
15 Jul 2023
Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows
Neural Information Processing Systems (NeurIPS), 2023
Sibylle Marcotte
Rémi Gribonval
Gabriel Peyré
305
27
0
30 Jun 2023
Catching Image Retrieval Generalization
Maksim Zhdanov
I. Karpukhin
VLM
156
0
0
23 Jun 2023
Understanding and Mitigating Extrapolation Failures in Physics-Informed Neural Networks
Lukas Fesser
Luca D'Amico-Wong
Richard Qiu
256
7
0
15 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
265
20
0
01 Jun 2023
(Almost) Provable Error Bounds Under Distribution Shift via Disagreement Discrepancy
Neural Information Processing Systems (NeurIPS), 2023
Elan Rosenfeld
Saurabh Garg
UQCV
162
12
0
01 Jun 2023
When Does Optimizing a Proper Loss Yield Calibration?
Neural Information Processing Systems (NeurIPS), 2023
Jarosław Błasiok
Parikshit Gopalan
Lunjia Hu
Preetum Nakkiran
225
37
0
30 May 2023
Consistent Optimal Transport with Empirical Conditional Measures
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Piyushi Manupriya
Rachit Keerti Das
Sayantan Biswas
S. Jagarlapudi
OT
416
6
0
25 May 2023
Exploring the Complexity of Deep Neural Networks through Functional Equivalence
International Conference on Machine Learning (ICML), 2023
Guohao Shen
338
6
0
19 May 2023
Adaptive Consensus Optimization Method for GANs
IEEE International Joint Conference on Neural Network (IJCNN), 2023
Sachin Kumar Danisetty
Santhosh Reddy Mylaram
Pawan Kumar
ODL
130
3
0
20 Apr 2023
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Neural Information Processing Systems (NeurIPS), 2023
Scott Pesme
Nicolas Flammarion
388
46
0
02 Apr 2023
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
International Conference on Learning Representations (ICLR), 2023
Avrajit Ghosh
He Lyu
Xitong Zhang
Rongrong Wang
190
27
0
02 Feb 2023
Why Deep Learning Generalizes
Benjamin L. Badger
TDI
AI4CE
127
4
0
17 Nov 2022
C-Mixup: Improving Generalization in Regression
Neural Information Processing Systems (NeurIPS), 2022
Huaxiu Yao
Yiping Wang
Linjun Zhang
James Zou
Chelsea Finn
UQCV
OOD
190
81
0
11 Oct 2022
DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning
Neural Information Processing Systems (NeurIPS), 2022
Siqi Xu
Lin Liu
Zhong Liu
CML
MedIm
183
12
0
10 Oct 2022
Learning Temporal Resolution in Spectrogram for Audio Classification
AAAI Conference on Artificial Intelligence (AAAI), 2022
Haohe Liu
Xubo Liu
Qiuqiang Kong
Wenwu Wang
Mark D. Plumbley
251
13
0
04 Oct 2022
The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels
Daniel Shwartz
Uri Stern
D. Weinshall
NoLa
235
2
0
02 Oct 2022
Why neural networks find simple solutions: the many regularizers of geometric complexity
Neural Information Processing Systems (NeurIPS), 2022
Benoit Dherin
Michael Munn
M. Rosca
David Barrett
320
42
0
27 Sep 2022
Robust Constrained Reinforcement Learning
Yue Wang
Fei Miao
Shaofeng Zou
152
20
0
14 Sep 2022
The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp
J. Perr-Sauer
L. Hayne
M. Lunacek
Jamil Gafur
AI4CE
256
1
0
25 Jul 2022