Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
15 June 2016
arXiv:1606.04838

Papers citing "Optimization Methods for Large-Scale Machine Learning"

Showing 50 of 1,491 citing papers.

Least Squares Auto-Tuning
Shane T. Barratt, Stephen P. Boyd
10 Apr 2019

Generalizing from a Few Examples: A Survey on Few-Shot Learning
Yaqing Wang, Quanming Yao, James T. Kwok, L. Ni
10 Apr 2019

On the approximation of the solution of partial differential equations by artificial neural networks trained by a multilevel Levenberg-Marquardt method
H. Calandra, Serge Gratton, E. Riccietti, X. Vasseur
09 Apr 2019

Convergence rates for the stochastic gradient descent method for non-convex objective functions
Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen
02 Apr 2019

Convergence rates for optimised adaptive importance samplers
Ömer Deniz Akyildiz, Joaquín Míguez
28 Mar 2019

OverSketched Newton: Fast Convex Optimization for Serverless Systems
Vipul Gupta, S. Kadhe, T. Courtade, Michael W. Mahoney, Kannan Ramchandran
21 Mar 2019

Noisy Accelerated Power Method for Eigenproblems with Applications
IEEE Transactions on Signal Processing, 2019
Vien V. Mai, M. Johansson
20 Mar 2019

TATi-Thermodynamic Analytics ToolkIt: TensorFlow-based software for posterior sampling in machine learning applications
Frederik Heber, Zofia Trstanova, Benedict Leimkuhler
20 Mar 2019

Combining Model and Parameter Uncertainty in Bayesian Neural Networks
A. Hubin, G. Storvik
18 Mar 2019

A Distributed Hierarchical SGD Algorithm with Sparse Global Reduction
Fan Zhou, Guojing Cong
12 Mar 2019

Recovery Bounds on Class-Based Optimal Transport: A Sum-of-Norms Regularization Framework
Arman Rahbar, Ashkan Panahi, M. Chehreghani, Devdatt Dubhashi, Hamid Krim
09 Mar 2019

SGD without Replacement: Sharper Rates for General Smooth Convex Functions
International Conference on Machine Learning (ICML), 2019
Prateek Jain, Dheeraj M. Nagaraj, Praneeth Netrapalli
04 Mar 2019

Time-Delay Momentum: A Regularization Perspective on the Convergence and Generalization of Stochastic Momentum for Deep Learning
Ziming Zhang, Wenju Xu, Alan Sullivan
02 Mar 2019

An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise
Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
21 Feb 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward
19 Feb 2019

ProxSARAH: An Efficient Algorithmic Framework for Stochastic Composite Nonconvex Optimization
Nhan H. Pham, Lam M. Nguyen, Dzung Phan, Quoc Tran-Dinh
15 Feb 2019

Forward-backward-forward methods with variance reduction for stochastic variational inequalities
R. Boț, P. Mertikopoulos, Mathias Staudigl, P. Vuong
09 Feb 2019

Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks
P. Parpas, Corey Muir
07 Feb 2019

Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol
06 Feb 2019

Riemannian adaptive stochastic gradient algorithms on matrix manifolds
Hiroyuki Kasai, Pratik Jawanpuria, Bamdev Mishra
04 Feb 2019

Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions
Adrien B. Taylor, Francis R. Bach
03 Feb 2019

Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions
Yunwen Lei, Ting Hu, Guiying Li, Shengcai Liu
03 Feb 2019

Non-asymptotic Analysis of Biased Stochastic Approximation Scheme
Conference on Learning Theory (COLT), 2019
Belhal Karimi, B. Miasojedow, Eric Moulines, Hoi-To Wai
02 Feb 2019

Multilevel Monte Carlo Variational Inference
Journal of Machine Learning Research (JMLR), 2019
Masahiro Fujisawa, Issei Sato
01 Feb 2019

MgNet: A Unified Framework of Multigrid and Convolutional Neural Network
Science China Mathematics, 2019
Juncai He, Jinchao Xu
29 Jan 2019

Variational Characterizations of Local Entropy and Heat Regularization in Deep Learning
Nicolas García Trillos, Zachary T. Kaplan, D. Sanz-Alonso
29 Jan 2019

Quasi-Newton Methods for Machine Learning: Forget the Past, Just Sample
A. Berahas, Majid Jahani, Peter Richtárik, Martin Takáč
28 Jan 2019

SGD: General Analysis and Improved Rates
Robert Mansel Gower, Nicolas Loizou, Xun Qian, Alibek Sailanbayev, Egor Shulgin, Peter Richtárik
27 Jan 2019

Estimate Sequences for Stochastic Composite Optimization: Variance Reduction, Acceleration, and Robustness to Noise
A. Kulunchakov, Julien Mairal
25 Jan 2019

Provable Smoothness Guarantees for Black-Box Variational Inference
Justin Domke
24 Jan 2019

To Relieve Your Headache of Training an MRF, Take AdVIL
Chongxuan Li, Chao Du, Kun Xu, Max Welling, Jun Zhu, Bo Zhang
24 Jan 2019

Large-Batch Training for LSTM and Beyond
Yang You, Jonathan Hseu, Chris Ying, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh
24 Jan 2019

Trajectory Normalized Gradients for Distributed Optimization
Jianqiao Wangni, Ke Li, Jianbo Shi, Jitendra Malik
24 Jan 2019

Decoupled Greedy Learning of CNNs
Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon
23 Jan 2019

Finite-Sum Smooth Optimization with SARAH
Lam M. Nguyen, Marten van Dijk, Dzung Phan, Phuong Ha Nguyen, Tsui-Wei Weng, Jayant Kalagnanam
22 Jan 2019

DTN: A Learning Rate Scheme with Convergence Rate of O(1/t) for SGD
Lam M. Nguyen, Phuong Ha Nguyen, Dzung Phan, Jayant Kalagnanam, Marten van Dijk
22 Jan 2019

AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo, Wantao Liu, Wang Wang, Q. Lu, Songlin Hu, Jizhong Han, Ruixuan Li
21 Jan 2019

Tuning parameter selection rules for nuclear norm regularized multivariate linear regression
Pan Shang, Lingchen Kong
19 Jan 2019

Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
Sattar Vakili, Sudeep Salgia, Qing Zhao
17 Jan 2019

Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization
Xiao Fu, Shahana Ibrahim, Hoi-To Wai, Cheng Gao, Kejun Huang
16 Jan 2019

Optimization Problems for Machine Learning: A Survey
Claudio Gambella, Bissan Ghaddar, Joe Naoum-Sawaya
16 Jan 2019

CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis, Pijika Watcharapichat, Matthias Weidlich, Kai Zou, Paolo Costa, Peter R. Pietzuch
08 Jan 2019

SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou, Junjie Yang, Huishuai Zhang, Yingbin Liang, Vahid Tarokh
02 Jan 2019

Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis
Salar Fattahi, Somayeh Sojoudi
30 Dec 2018

On Lazy Training in Differentiable Programming
Lénaïc Chizat, Edouard Oyallon, Francis R. Bach
19 Dec 2018

A stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs
R. Kannan, James R. Luedtke
17 Dec 2018

An Empirical Model of Large-Batch Training
Sam McCandlish, Jared Kaplan, Dario Amodei, OpenAI Dota Team
14 Dec 2018

Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer
12 Dec 2018

Layer-Parallel Training of Deep Residual Neural Networks
Stefanie Günther, Lars Ruthotto, J. Schroder, E. Cyr, N. Gauger
11 Dec 2018

A probabilistic incremental proximal gradient method
Ömer Deniz Akyildiz, Émilie Chouzenoux, Victor Elvira, Joaquín Míguez
04 Dec 2018

Page 26 of 30