A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization

14 February 2018 (arXiv: 1802.05155)
Tianyi Liu, Zhehui Chen, Enlu Zhou, T. Zhao
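For context, the method analyzed in this paper is SGD with classical (heavy-ball) momentum, whose update is v_{t+1} = mu * v_t - eta * g_t and x_{t+1} = x_t + v_{t+1}, where g_t is a stochastic gradient. Below is a minimal NumPy sketch of that update; the quadratic objective, noise model, step size, and momentum coefficient are illustrative assumptions, not values taken from the paper.

    # Minimal sketch of momentum SGD (heavy-ball), the update rule the paper
    # studies via diffusion approximations. Objective, noise model, and
    # hyperparameter values below are illustrative assumptions only.
    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_grad(x):
        # Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive Gaussian noise.
        return x + 0.1 * rng.standard_normal(x.shape)

    eta, mu = 0.01, 0.9          # step size and momentum coefficient (assumed values)
    x = rng.standard_normal(10)  # iterate
    v = np.zeros_like(x)         # velocity (momentum buffer)

    for t in range(1000):
        v = mu * v - eta * noisy_grad(x)  # v_{t+1} = mu * v_t - eta * g_t
        x = x + v                         # x_{t+1} = x_t + v_{t+1}

    print(f"final ||x|| = {np.linalg.norm(x):.4f}")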

Papers citing "A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization" (6 of 6 papers shown):
  • Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum. Guojing Cong, Tianyi Liu. 01 Oct 2021.
  • Rethinking the Hyperparameters for Fine-tuning. Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto. 19 Feb 2020. [VLM]
  • Learning to Defend by Learning to Attack. Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, T. Zhao. 03 Nov 2018.
  • Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration. Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu. 12 Sep 2018.
  • Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization. Tianyi Liu, Shiyang Li, Jianping Shi, Enlu Zhou, T. Zhao. 04 Jun 2018.
  • A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay. L. Smith. 26 Mar 2018.