1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,491 papers shown
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
274
4
0
03 Jul 2020
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems
Zhan Gao
Alec Koppel
Alejandro Ribeiro
183
14
0
02 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
459
318
0
02 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
228
57
0
02 Jul 2020
Convolutional Neural Network Training with Distributed K-FAC
J. G. Pauloski
Zhao Zhang
Lei Huang
Weijia Xu
Ian Foster
185
34
0
01 Jul 2020
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Scott Pesme
Hadrien Hendrikx
Nicolas Flammarion
162
16
0
01 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
Jiaxuan Wang
Jenna Wiens
167
15
0
30 Jun 2020
A Multilevel Approach to Training
Vanessa Braglia
Alena Kopaničáková
Rolf Krause
146
3
0
28 Jun 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
304
64
0
26 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
257
28
0
26 Jun 2020
DeltaGrad: Rapid retraining of machine learning models
Yinjun Wu
Guang Cheng
S. Davidson
MU
279
245
0
26 Jun 2020
Learning compositional functions via multiplicative weight updates
Jeremy Bernstein
Jiawei Zhao
M. Meister
Xuan Li
Anima Anandkumar
Yisong Yue
246
33
0
25 Jun 2020
Effective Elastic Scaling of Deep Learning Workloads
IEEE/ACM International Symposium on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems (MASCOTS), 2020
Vaibhav Saxena
K.R. Jayaram
Saurav Basu
Yogish Sabharwal
Ashish Verma
150
10
0
24 Jun 2020
Advances in Asynchronous Parallel and Distributed Optimization
Proceedings of the IEEE (Proc. IEEE), 2020
Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
238
95
0
24 Jun 2020
Hyperparameter Ensembles for Robustness and Uncertainty Quantification
Neural Information Processing Systems (NeurIPS), 2020
F. Wenzel
Jasper Snoek
Dustin Tran
Rodolphe Jenatton
UQCV
543
238
0
24 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
183
21
0
24 Jun 2020
Continuous Submodular Function Maximization
Yatao Bian
J. M. Buhmann
Andreas Krause
204
22
0
24 Jun 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
212
10
0
24 Jun 2020
DeepTopPush: Simple and Scalable Method for Accuracy at the Top
V. Mácha
Lukáš Adam
Václav Šmídl
240
4
0
22 Jun 2020
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Samuel Horváth
Peter Richtárik
234
64
0
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
379
93
0
18 Jun 2020
A block coordinate descent optimizer for classification problems exploiting convexity
Ravi G. Patel
N. Trask
Mamikon A. Gulian
E. Cyr
ODL
127
8
0
17 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
521
64
0
16 Jun 2020
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD
Ruosi Wan
Zhanxing Zhu
Xiangyu Zhang
Jian Sun
211
11
0
15 Jun 2020
Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization
Monte Carlo and Quasi-Monte Carlo Methods (MCQMC), 2020
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
F. Briol
BDL
171
22
0
12 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Journal of Complexity (J. Complexity), 2020
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
219
39
0
12 Jun 2020
Stochastic Optimization for Performative Prediction
Neural Information Processing Systems (NeurIPS), 2020
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
327
135
0
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Neural Information Processing Systems (NeurIPS), 2020
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
362
151
0
10 Jun 2020
A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account
Khashayar Namdar
M. Haider
Farzad Khalvati
176
34
0
08 Jun 2020
The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization
Wei Tao
Zhisong Pan
Gao-wei Wu
Qing Tao
121
19
0
08 Jun 2020
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Courtney Paquette
B. V. Merrienboer
Elliot Paquette
Fabian Pedregosa
381
29
0
08 Jun 2020
SONIA: A Symmetric Blockwise Truncated Optimization Algorithm
Majid Jahani
M. Nazari
R. Tappenden
A. Berahas
Martin Takáč
ODL
154
10
0
06 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
269
7
0
05 Jun 2020
Scalable Plug-and-Play ADMM with Convergence Guarantees
Yu Sun
Zihui Wu
Xiaojian Xu
B. Wohlberg
Ulugbek S. Kamilov
BDL
314
97
0
05 Jun 2020
Asymptotic Analysis of Conditioned Stochastic Gradient Descent
Rémi Leluc
François Portier
290
4
0
04 Jun 2020
A mathematical model for automatic differentiation in machine learning
Neural Information Processing Systems (NeurIPS), 2020
Jérôme Bolte
Edouard Pauwels
185
73
0
03 Jun 2020
Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations
International Conference on Machine Learning and Applications (ICMLA), 2020
Zheng Shi
Nur Sila Gulgec
A. Berahas
S. Pakzad
Martin Takáč
160
11
0
02 Jun 2020
Carathéodory Sampling for Stochastic Gradient Descent
Francesco Cosentino
Harald Oberhauser
Alessandro Abate
161
1
0
02 Jun 2020
Improved SVRG for quadratic functions
N. Kahalé
253
0
0
01 Jun 2020
Artificial neural networks for neuroscientists: A primer
Neuron (Neuron), 2020
G. R. Yang
Xiao-Jing Wang
449
302
0
01 Jun 2020
Data-Driven Methods to Monitor, Model, Forecast and Control Covid-19 Pandemic: Leveraging Data Science, Epidemiology and Control Theory
Teodoro Alamo
Daniel Gutiérrez-Reina
P. Millán
136
30
0
01 Jun 2020
Pruning via Iterative Ranking of Sensitivity Statistics
Stijn Verdenius
M. Stol
Patrick Forré
AAML
174
42
0
01 Jun 2020
Better scalability under potentially heavy-tailed gradients
Matthew J. Holland
262
1
0
01 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
AAAI Conference on Artificial Intelligence (AAAI), 2020
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
459
338
0
01 Jun 2020
A New Accelerated Stochastic Gradient Method with Momentum
Liang Liu
Xiaopeng Luo
ODL
70
5
0
31 May 2020
Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts
Nature Machine Intelligence (NMI), 2020
Bo Pang
Kaiwen Zha
Hanwen Cao
Jiajun Tang
Minghui Yu
Cewu Lu
176
27
0
30 May 2020
CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing
Scientific Reports (Sci Rep), 2020
O. Borysenko
M. Byshkin
ODL
155
17
0
29 May 2020
HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism
USENIX Annual Technical Conference (USENIX ATC), 2020
Jay H. Park
Gyeongchan Yun
Chang Yi
N. T. Nguyen
Seungmin Lee
Jaesik Choi
S. Noh
Young-ri Choi
MoE
238
166
0
28 May 2020
Convergence Analysis of Riemannian Stochastic Approximation Schemes
Alain Durmus
P. Jiménez
Eric Moulines
Salem Said
Hoi-To Wai
261
10
0
27 May 2020
Scalable Privacy-Preserving Distributed Learning
D. Froelicher
J. Troncoso-Pastoriza
Apostolos Pyrgelis
Sinem Sav
João Sá Sousa
Jean-Philippe Bossuat
Jean-Pierre Hubaux
FedML
269
76
0
19 May 2020