ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2108.11371
  4. Cited By
Understanding the Generalization of Adam in Learning Neural Networks
  with Proper Regularization

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization

25 August 2021
Difan Zou
Yuan Cao
Yuanzhi Li
Quanquan Gu
    MLT
    AI4CE
ArXivPDFHTML

Papers citing "Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization"

8 / 8 papers shown
Title
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
52
0
0
11 Apr 2025
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
13
1
0
12 Apr 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Yusu Hong
Junhong Lin
30
10
0
06 Feb 2024
Revisiting Knowledge Distillation under Distribution Shift
Revisiting Knowledge Distillation under Distribution Shift
Songming Zhang
Ziyu Lyu
Xiaofeng Chen
6
1
0
25 Dec 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer
  Linear Convolutional Neural Networks
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
14
5
0
20 Jun 2023
The Mechanism of Prediction Head in Non-contrastive Self-supervised
  Learning
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
6
34
0
12 May 2022
A Simple Convergence Proof of Adam and Adagrad
A Simple Convergence Proof of Adam and Adagrad
Alexandre Défossez
Léon Bottou
Francis R. Bach
Nicolas Usunier
56
143
0
05 Mar 2020
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
V. Papyan
Yaniv Romano
Michael Elad
48
283
0
27 Jul 2016
1