
Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization

4 October 2018
Anas Barakat, Pascal Bianchi
arXiv: 1810.02263 (abs · PDF · HTML)
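
For context, the paper analyzes the Adam optimizer in the non-convex stochastic setting. Below is a minimal sketch of the standard Adam update (Kingma & Ba, 2015) that this line of work studies; the function name adam_step and its defaults are illustrative, and the exact variant analyzed in the paper may differ in details such as how bias correction is handled:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of the standard Adam update (Kingma & Ba, 2015).

    theta: parameters; grad: stochastic gradient at theta;
    m, v: running first- and second-moment estimates; t: 1-based step index.
    """
    m = beta1 * m + (1 - beta1) * grad        # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # exponential moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```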

Papers citing "Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization"

35 citing papers (all shown)
Transformative or Conservative? Conservation laws for ResNets and Transformers
Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré · 06 Jun 2025

PADAM: Parallel averaged Adam reduces the error for stochastic optimization in scientific machine learning
Arnulf Jentzen, Julian Kranz, Adrian Riekert · 28 May 2025 · ODL

Sharp higher order convergence rates for the Adam optimizer
Steffen Dereich, Arnulf Jentzen, Adrian Riekert · 28 Apr 2025 · ODL

Diagnosis of Patients with Viral, Bacterial, and Non-Pneumonia Based on Chest X-Ray Images Using Convolutional Neural Networks
Carlos Arizmendi, Jorge Pinto, Alejandro Arboleda, Hernando Gonzalez · 03 Mar 2025

Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sizhuang He, Ananyae Kumar Bhartari, Bowen Li, P. Perdikaris · 02 Feb 2025 · PINN

A second-order-like optimizer with adaptive gradient scaling for deep learning
Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Andrei Purica · 08 Oct 2024 · ODL

SaulLM-54B & SaulLM-141B: Scaling Up Domain Adaptation for the Legal Domain
Pierre Colombo, T. Pires, Malik Boudiaf, Rui Melo, Dominic Culver, Sofia Morgado, Etienne Malaboeuf, Gabriel Hautreux, Johanne Charpentier, Michael Desa · 28 Jul 2024 · ELM, AILaw, ALM

Regularized DeepIV with Model Selection
Zihao Li, Hui Lan, Vasilis Syrgkanis, Mengdi Wang, Masatoshi Uehara · 07 Mar 2024

Accelerating Distributed Stochastic Optimization via Self-Repellent Random Walks
Jie Hu, Vishwaraj Doshi, Do Young Eun · 18 Jan 2024

Adam-family Methods with Decoupled Weight Decay in Deep Learning
Kuang-Yu Ding, Nachuan Xiao, Kim-Chuan Toh · 13 Oct 2023

On the Implicit Bias of Adam
M. D. Cattaneo, Jason M. Klusowski, Boris Shigida · 31 Aug 2023

Stability and Convergence of Distributed Stochastic Approximations with large Unbounded Stochastic Information Delays
Adrian Redder, Arunselvan Ramaswamy, Holger Karl · 11 May 2023

Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees
Nachuan Xiao, Xiaoyin Hu, Xin Liu, Kim-Chuan Toh · 06 May 2023

Beyond Mahalanobis-Based Scores for Textual OOD Detection
Pierre Colombo, Eduardo Dadalto Camara Gomes, Guillaume Staerman, Nathan Noiry, Pablo Piantanida · 24 Nov 2022 · OODD

Toward Equation of Motion for Deep Neural Networks: Continuous-time Gradient Descent and Discretization Error Analysis
Taiki Miyagawa · 28 Oct 2022

Efficiency Ordering of Stochastic Gradient Descent
Jie Hu, Vishwaraj Doshi, Do Young Eun · 15 Sep 2022

Convergence of Batch Updating Methods with Approximate Gradients and/or Noisy Measurements: Theory and Computational Results
Tadipatri Uday, M. Vidyasagar · 12 Sep 2022

Adaptive Gradient Methods at the Edge of Stability
Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, ..., Daniel Suo, David E. Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer · 29 Jul 2022 · ODL

The Slingshot Mechanism: An Empirical Study of Adaptive Optimizers and the Grokking Phenomenon
Vimal Thilak, Etai Littwin, Shuangfei Zhai, Omid Saremi, Roni Paiss, J. Susskind · 10 Jun 2022

A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning
Kushal Chakrabarti, Nikhil Chopra · 04 Jun 2022 · ODL, AI4CE

A theoretical and empirical study of new adaptive algorithms with additional momentum steps and shifted updates for stochastic non-convex optimization
C. Alecsa · 16 Oct 2021

Generalized AdaGrad (G-AdaGrad) and Adam: A State-Space Perspective
Kushal Chakrabarti, Nikhil Chopra · 31 May 2021 · ODL, AI4CE

On the Distributional Properties of Adaptive Gradients
Z. Zhiyi, Liu Ziyin · 15 May 2021

Asymptotic study of stochastic adaptive algorithm in non-convex landscape
S. Gadat, Ioana Gavra · 10 Dec 2020

Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
Anas Barakat, Pascal Bianchi, W. Hachem, S. Schechtman · 07 Dec 2020

Sequential convergence of AdaGrad algorithm for smooth convex optimization
Cheik Traoré, Edouard Pauwels · 24 Nov 2020

A Qualitative Study of the Dynamic Behavior for Adaptive Gradient Algorithms
Chao Ma, Lei Wu, E. Weinan · 14 Sep 2020 · ODL

Incremental Without Replacement Sampling in Nonconvex Optimization
Edouard Pauwels · 15 Jul 2020

Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms
A. Lovas, Iosif Lytras, Miklós Rásonyi, Sotirios Sabanis · 25 Jun 2020

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks
Umut Simsekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu · 16 Jun 2020

Gradient descent with momentum --- to accelerate or to super-accelerate?
Goran Nakerst, John Brennan, M. Haque · 17 Jan 2020 · ODL

An empirical study of neural networks for trend detection in time series
Alexandre Miot, Gilles Drigout · 09 Dec 2019 · AI4TS

Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning
Jérôme Bolte, Edouard Pauwels · 23 Sep 2019

An Inertial Newton Algorithm for Deep Learning
Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels · 29 May 2019 · PINN, ODL

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward · 19 Feb 2019