Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,491 papers shown
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
274
4
0
03 Jul 2020
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems
Zhan Gao
Alec Koppel
Alejandro Ribeiro
183
14
0
02 Jul 2020
Federated Learning with Compression: Unified Analysis and Sharp Guarantees
Farzin Haddadpour
Mohammad Mahdi Kamani
Aryan Mokhtari
M. Mahdavi
FedML
459
318
0
02 Jul 2020
On the Outsized Importance of Learning Rates in Local Update Methods
Zachary B. Charles
Jakub Konečný
FedML
228
57
0
02 Jul 2020
Convolutional Neural Network Training with Distributed K-FAC
J. G. Pauloski
Zhao Zhang
Lei Huang
Weijia Xu
Ian Foster
185
34
0
01 Jul 2020
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
Scott Pesme
Hadrien Hendrikx
Nicolas Flammarion
162
16
0
01 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
Jiaxuan Wang
Jenna Wiens
167
15
0
30 Jun 2020
A Multilevel Approach to Training
Vanessa Braglia
Alena Kopaničáková
Rolf Krause
146
3
0
28 Jun 2020
Is SGD a Bayesian sampler? Well, almost
Chris Mingard
Guillermo Valle Pérez
Joar Skalse
A. Louis
BDL
304
64
0
26 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
257
28
0
26 Jun 2020
DeltaGrad: Rapid retraining of machine learning models
Yinjun Wu
Guang Cheng
S. Davidson
MU
279
245
0
26 Jun 2020
Learning compositional functions via multiplicative weight updates
Jeremy Bernstein
Jiawei Zhao
M. Meister
Xuan Li
Anima Anandkumar
Yisong Yue
246
33
0
25 Jun 2020
Effective Elastic Scaling of Deep Learning Workloads
IEEE/ACM International Symposium on Modeling, Analysis, and Simulation On Computer and Telecommunication Systems (MASCOTS), 2020
Vaibhav Saxena
K.R. Jayaram
Saurav Basu
Yogish Sabharwal
Ashish Verma
150
10
0
24 Jun 2020
Advances in Asynchronous Parallel and Distributed Optimization
Proceedings of the IEEE (Proc. IEEE), 2020
Mahmoud Assran
Arda Aytekin
Hamid Reza Feyzmahdavian
M. Johansson
Michael G. Rabbat
238
95
0
24 Jun 2020
Hyperparameter Ensembles for Robustness and Uncertainty Quantification
Neural Information Processing Systems (NeurIPS), 2020
F. Wenzel
Jasper Snoek
Dustin Tran
Rodolphe Jenatton
UQCV
543
238
0
24 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
Shuai Zheng
Yanghua Peng
Sheng Zha
Mu Li
ODL
183
21
0
24 Jun 2020
Continuous Submodular Function Maximization
Yatao Bian
J. M. Buhmann
Andreas Krause
204
22
0
24 Jun 2020
Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms
Thinh T. Doan
FedML
212
10
0
24 Jun 2020
DeepTopPush: Simple and Scalable Method for Accuracy at the Top
V. Mácha
Lukáš Adam
Václav Šmídl
240
4
0
22 Jun 2020
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
Samuel Horváth
Peter Richtárik
234
64
0
19 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower
Othmane Sebbouh
Nicolas Loizou
379
93
0
18 Jun 2020
A block coordinate descent optimizer for classification problems exploiting convexity
Ravi G. Patel
N. Trask
Mamikon A. Gulian
E. Cyr
ODL
127
8
0
17 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
521
64
0
16 Jun 2020
Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD
Ruosi Wan
Zhanxing Zhu
Xiangyu Zhang
Jian Sun
211
11
0
15 Jun 2020
Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization
Monte Carlo and Quasi-Monte Carlo Methods (MCQMC), 2020
Shijing Si
Chris J. Oates
Andrew B. Duncan
Lawrence Carin
F. Briol
BDL
171
22
0
12 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Journal of Complexity (J. Complexity), 2020
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
219
39
0
12 Jun 2020
Stochastic Optimization for Performative Prediction
Neural Information Processing Systems (NeurIPS), 2020
Celestine Mendler-Dünner
Juan C. Perdomo
Tijana Zrnic
Moritz Hardt
327
135
0
12 Jun 2020
Random Reshuffling: Simple Analysis with Vast Improvements
Neural Information Processing Systems (NeurIPS), 2020
Konstantin Mishchenko
Ahmed Khaled
Peter Richtárik
362
151
0
10 Jun 2020
A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account
Khashayar Namdar
M. Haider
Farzad Khalvati
176
34
0
08 Jun 2020
The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization
Wei Tao
Zhisong Pan
Gao-wei Wu
Qing Tao
121
19
0
08 Jun 2020
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Courtney Paquette
B. V. Merrienboer
Elliot Paquette
Fabian Pedregosa
381
29
0
08 Jun 2020
SONIA: A Symmetric Blockwise Truncated Optimization Algorithm
Majid Jahani
M. Nazari
R. Tappenden
A. Berahas
Martin Takáč
ODL
154
10
0
06 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
269
7
0
05 Jun 2020
Scalable Plug-and-Play ADMM with Convergence Guarantees
Yu Sun
Zihui Wu
Xiaojian Xu
B. Wohlberg
Ulugbek S. Kamilov
BDL
314
97
0
05 Jun 2020
Asymptotic Analysis of Conditioned Stochastic Gradient Descent
Rémi Leluc
François Portier
290
4
0
04 Jun 2020
A mathematical model for automatic differentiation in machine learning
Neural Information Processing Systems (NeurIPS), 2020
Jérôme Bolte
Edouard Pauwels
185
73
0
03 Jun 2020
Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations
International Conference on Machine Learning and Applications (ICMLA), 2020
Zheng Shi
Nur Sila Gulgec
A. Berahas
S. Pakzad
Martin Takáč
160
11
0
02 Jun 2020
Carathéodory Sampling for Stochastic Gradient Descent
Francesco Cosentino
Harald Oberhauser
Alessandro Abate
161
1
0
02 Jun 2020
Improved SVRG for quadratic functions
N. Kahalé
253
0
0
01 Jun 2020
Artificial neural networks for neuroscientists: A primer
Neuron (Neuron), 2020
G. R. Yang
Xiao-Jing Wang
449
302
0
01 Jun 2020
Data-Driven Methods to Monitor, Model, Forecast and Control Covid-19 Pandemic: Leveraging Data Science, Epidemiology and Control Theory
Teodoro Alamo
Daniel Gutiérrez-Reina
P. Millán
136
30
0
01 Jun 2020
Pruning via Iterative Ranking of Sensitivity Statistics
Stijn Verdenius
M. Stol
Patrick Forré
AAML
174
42
0
01 Jun 2020
Better scalability under potentially heavy-tailed gradients
Matthew J. Holland
262
1
0
01 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
AAAI Conference on Artificial Intelligence (AAAI), 2020
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
459
338
0
01 Jun 2020
A New Accelerated Stochastic Gradient Method with Momentum
Liang Liu
Xiaopeng Luo
ODL
70
5
0
31 May 2020
Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts
Nature Machine Intelligence (NMI), 2020
Bo Pang
Kaiwen Zha
Hanwen Cao
Jiajun Tang
Minghui Yu
Cewu Lu
176
27
0
30 May 2020
CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing
Scientific Reports (Sci Rep), 2020
O. Borysenko
M. Byshkin
ODL
155
17
0
29 May 2020
HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism
USENIX Annual Technical Conference (USENIX ATC), 2020
Jay H. Park
Gyeongchan Yun
Chang Yi
N. T. Nguyen
Seungmin Lee
Jaesik Choi
S. Noh
Young-ri Choi
MoE
238
166
0
28 May 2020
Convergence Analysis of Riemannian Stochastic Approximation Schemes
Alain Durmus
P. Jiménez
Eric Moulines
Salem Said
Hoi-To Wai
261
10
0
27 May 2020
Scalable Privacy-Preserving Distributed Learning
D. Froelicher
J. Troncoso-Pastoriza
Apostolos Pyrgelis
Sinem Sav
João Sá Sousa
Jean-Philippe Bossuat
Jean-Pierre Hubaux
FedML
269
76
0
19 May 2020