
Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization

4 October 2018
Anas Barakat, Pascal Bianchi
arXiv: 1810.02263 (abs · PDF · HTML)
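
For context, the paper analyzes the Adam optimizer in the non-convex stochastic setting. Below is a minimal sketch of the standard Adam update (Kingma & Ba, 2015) that this line of work studies; the function name adam_step and its defaults are illustrative, and the exact variant analyzed in the paper may differ in details such as how bias correction is handled:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of the standard Adam update (Kingma & Ba, 2015).

    theta: parameters; grad: stochastic gradient at theta;
    m, v: running first- and second-moment estimates; t: 1-based step index.
    """
    m = beta1 * m + (1 - beta1) * grad        # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # exponential moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```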

Papers citing "Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization"

35 citing papers (all shown)
Transformative or Conservative? Conservation laws for ResNets and Transformers
Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré · 06 Jun 2025

PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning
Arnulf Jentzen, Julian Kranz, Adrian Riekert · 28 May 2025 · ODL

Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich, Arnulf Jentzen, Adrian Riekert · 28 Apr 2025 · ODL

Diagnosis of Patients with Viral, Bacterial, and Non-Pneumonia Based on Chest X-Ray Images Using Convolutional Neural Networks
Carlos Arizmendi, Jorge Pinto, Alejandro Arboleda, Hernando Gonzalez · 03 Mar 2025

Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sizhuang He, Ananyae Kumar Bhartari, Bowen Li, P. Perdikaris · 02 Feb 2025 · PINN

A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Andrei Purica · 08 Oct 2024 · ODL

SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo, T. Pires, Malik Boudiaf, Rui Melo, Dominic Culver, Sofia Morgado, Etienne Malaboeuf, Gabriel Hautreux, Johanne Charpentier, Michael Desa · 28 Jul 2024 · ELM, AILaw, ALM

Regularized DeepIV with Model Selection
Zihao Li, Hui Lan, Vasilis Syrgkanis, Mengdi Wang, Masatoshi Uehara · 07 Mar 2024

Accelerating Distributed Stochastic Optimization via Self-Repellent Random Walks
Jie Hu, Vishwaraj Doshi, Do Young Eun · 18 Jan 2024

Adam-family Methods with Decoupled Weight Decay in Deep Learning
Kuang-Yu Ding, Nachuan Xiao, Kim-Chuan Toh · 13 Oct 2023

On the Implicit Bias of Adam
M. D. Cattaneo, Jason M. Klusowski, Boris Shigida · 31 Aug 2023

Stability and Convergence of Distributed Stochastic Approximations with large Unbounded Stochastic Information Delays
Adrian Redder, Arunselvan Ramaswamy, Holger Karl · 11 May 2023

Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees
Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh · 06 May 2023

Beyond Mahalanobis-Based Scores for Textual OOD Detection
Pierre Colombo, Eduardo Dadalto Camara Gomes, Guillaume Staerman, Nathan Noiry, Pablo Piantanida · 24 Nov 2022 · OODD

Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis
Taiki Miyagawa · 28 Oct 2022

Efficiency Ordering of Stochastic Gradient Descent
Jie Hu, Vishwaraj Doshi, Do Young Eun · 15 Sep 2022

Convergence of Batch Updating Methods with Approximate Gradients and/or Noisy Measurements: Theory and Computational Results
Tadipatri Uday, M. Vidyasagar · 12 Sep 2022

Adaptive Gradient Methods at the Edge of Stability
Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, ..., Daniel Suo, David E. Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer · 29 Jul 2022 · ODL

The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Vimal Thilak, Etai Littwin, Shuangfei Zhai, Omid Saremi, Roni Paiss, J. Susskind · 10 Jun 2022

A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning
Kushal Chakrabarti, Nikhil Chopra · 04 Jun 2022 · ODL, AI4CE

A theoretical and empirical study of new adaptive algorithms with additional momentum steps and shifted updates for stochastic non-convex optimization
C. Alecsa · 16 Oct 2021

Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti, Nikhil Chopra · 31 May 2021 · ODL, AI4CE

On the Distributional Properties of Adaptive Gradients
Z. Zhiyi, Liu Ziyin · 15 May 2021

Asymptotic study of stochastic adaptive algorithm in non-convex landscape
S. Gadat, Ioana Gavra · 10 Dec 2020

Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
Anas Barakat, Pascal Bianchi, W. Hachem, S. Schechtman · 07 Dec 2020

Sequential convergence of AdaGrad algorithm for smooth convex optimization
Cheik Traoré, Edouard Pauwels · 24 Nov 2020

A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms
Chao Ma, Lei Wu, E. Weinan · 14 Sep 2020 · ODL

Incremental Without Replacement Sampling in Nonconvex Optimization
Edouard Pauwels · 15 Jul 2020

Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms
A. Lovas, Iosif Lytras, Miklós Rásonyi, Sotirios Sabanis · 25 Jun 2020

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu · 16 Jun 2020

Gradient descent with momentum --- to accelerate or to super-accelerate?
Goran Nakerst, John Brennan, M. Haque · 17 Jan 2020 · ODL

An empirical study of neural networks for trend detection in time series
Alexandre Miot, Gilles Drigout · 09 Dec 2019 · AI4TS

Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning
Jérôme Bolte, Edouard Pauwels · 23 Sep 2019

An Inertial Newton Algorithm for Deep Learning
Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels · 29 May 2019 · PINN, ODL

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward · 19 Feb 2019