A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization

14 February 2018 (arXiv: 1802.05155)
Tianyi Liu, Zhehui Chen, Enlu Zhou, T. Zhao
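For context, the method analyzed in this paper is SGD with classical (heavy-ball) momentum, whose update is v_{t+1} = mu * v_t - eta * g_t and x_{t+1} = x_t + v_{t+1}, where g_t is a stochastic gradient. Below is a minimal NumPy sketch of that update; the quadratic objective, noise model, step size, and momentum coefficient are illustrative assumptions, not values taken from the paper.

    # Minimal sketch of momentum SGD (heavy-ball), the update rule the paper
    # studies via diffusion approximations. Objective, noise model, and
    # hyperparameter values below are illustrative assumptions only.
    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_grad(x):
        # Stochastic gradient of f(x) = 0.5 * ||x||^2 with additive Gaussian noise.
        return x + 0.1 * rng.standard_normal(x.shape)

    eta, mu = 0.01, 0.9          # step size and momentum coefficient (assumed values)
    x = rng.standard_normal(10)  # iterate
    v = np.zeros_like(x)         # velocity (momentum buffer)

    for t in range(1000):
        v = mu * v - eta * noisy_grad(x)  # v_{t+1} = mu * v_t - eta * g_t
        x = x + v                         # x_{t+1} = x_t + v_{t+1}

    print(f"final ||x|| = {np.linalg.norm(x):.4f}")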

Papers citing "A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization" (6 of 6 papers shown):
  • Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum. Guojing Cong, Tianyi Liu. 01 Oct 2021.
  • Rethinking the Hyperparameters for Fine-tuning. Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto. 19 Feb 2020. [VLM]
  • Learning to Defend by Learning to Attack. Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, T. Zhao. 03 Nov 2018.
  • Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration. Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu. 12 Sep 2018.
  • Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization. Tianyi Liu, Shiyang Li, Jianping Shi, Enlu Zhou, T. Zhao. 04 Jun 2018.
  • A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay. L. Smith. 26 Mar 2018.