Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,491 papers shown

Weak error analysis for stochastic gradient descent optimization algorithms

274

03 Jul 2020

Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems

Zhan Gao

Alec Koppel

Alejandro Ribeiro

183

02 Jul 2020

Federated Learning with Compression: Unified Analysis and Sharp Guarantees

Farzin Haddadpour

Mohammad Mahdi Kamani

Aryan Mokhtari

M. Mahdavi

FedML

459

318

02 Jul 2020

On the Outsized Importance of Learning Rates in Local Update Methods

Zachary B. Charles

Jakub Konečný

FedML

228

02 Jul 2020

Convolutional Neural Network Training with Distributed K-FAC

185

01 Jul 2020

On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent

Scott Pesme

Hadrien Hendrikx

Nicolas Flammarion

162

01 Jul 2020

AdaSGD: Bridging the gap between SGD and Adam

Jiaxuan Wang

Jenna Wiens

167

30 Jun 2020

A Multilevel Approach to Training

Vanessa Braglia

Alena Kopaničáková

Rolf Krause

146

28 Jun 2020

Is SGD a Bayesian sampler? Well, almost

Chris Mingard

Guillermo Valle Pérez

Joar Skalse

A. Louis

BDL

304

26 Jun 2020

What they do when in doubt: a study of inductive biases in seq2seq learners

Eugene Kharitonov

Rahma Chaabouni

257

26 Jun 2020

DeltaGrad: Rapid retraining of machine learning models

Yinjun Wu

Guang Cheng

S. Davidson

279

245

26 Jun 2020

Learning compositional functions via multiplicative weight updates

246

25 Jun 2020

Effective Elastic Scaling of Deep Learning Workloads

IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2020

150

24 Jun 2020

Advances in Asynchronous Parallel and Distributed Optimization

Proceedings of the IEEE (Proc. IEEE), 2020

Mahmoud Assran

Arda Aytekin

Hamid Reza Feyzmahdavian

M. Johansson

Michael G. Rabbat

238

24 Jun 2020

Hyperparameter Ensembles for Robustness and Uncertainty Quantification

Neural Information Processing Systems (NeurIPS), 2020

543

238

24 Jun 2020

Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

Sheng Zha

183

24 Jun 2020

Continuous Submodular Function Maximization

Yatao Bian

J. M. Buhmann

Andreas Krause

204

24 Jun 2020

Local Stochastic Approximation: A Unified View of Federated Learning and Distributed Multi-Task Reinforcement Learning Algorithms

Thinh T. Doan

FedML

212

24 Jun 2020

DeepTopPush: Simple and Scalable Method for Accuracy at the Top

V. Mácha

Lukáš Adam

Václav Šmídl

240

22 Jun 2020

A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning

Samuel Horváth

Peter Richtárik

234

19 Jun 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation

Robert Mansel Gower

Othmane Sebbouh

Nicolas Loizou

379

18 Jun 2020

A block coordinate descent optimizer for classification problems exploiting convexity

127

17 Jun 2020

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training

521

16 Jun 2020

Spherical Motion Dynamics: Learning Dynamics of Neural Network with Normalization, Weight Decay, and SGD

Ruosi Wan

Zhanxing Zhu

Xiangyu Zhang

Jian Sun

211

15 Jun 2020

Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization

Monte Carlo and Quasi-Monte Carlo Methods (MCQMC), 2020

Shijing Si

Chris J. Oates

Andrew B. Duncan

Lawrence Carin

F. Briol

BDL

171

12 Jun 2020

Non-convergence of stochastic gradient descent in the training of deep neural networks

Journal of Complexity (J. Complexity), 2020

Patrick Cheridito

Arnulf Jentzen

Florian Rossmannek

219

12 Jun 2020

Stochastic Optimization for Performative Prediction

Neural Information Processing Systems (NeurIPS), 2020

Celestine Mendler-Dünner

Juan C. Perdomo

Tijana Zrnic

Moritz Hardt

327

135

12 Jun 2020

Random Reshuffling: Simple Analysis with Vast Improvements

Neural Information Processing Systems (NeurIPS), 2020

Konstantin Mishchenko

Ahmed Khaled

Peter Richtárik

362

151

10 Jun 2020

A Modified AUC for Training Convolutional Neural Networks: Taking Confidence into Account

Khashayar Namdar

M. Haider

Farzad Khalvati

176

08 Jun 2020

The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization

121

08 Jun 2020

Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis

381

08 Jun 2020

SONIA: A Symmetric Blockwise Truncated Optimization Algorithm

154

06 Jun 2020

UFO-BLO: Unbiased First-Order Bilevel Optimization

Valerii Likhosherstov

269

05 Jun 2020

Scalable Plug-and-Play ADMM with Convergence Guarantees

Ulugbek S. Kamilov

314

05 Jun 2020

Asymptotic Analysis of Conditioned Stochastic Gradient Descent

Rémi Leluc

François Portier

290

04 Jun 2020

A mathematical model for automatic differentiation in machine learning

Neural Information Processing Systems (NeurIPS), 2020

Jérôme Bolte

Edouard Pauwels

185

03 Jun 2020

Finite Difference Neural Networks: Fast Prediction of Partial Differential Equations

International Conference on Machine Learning and Applications (ICMLA), 2020

160

02 Jun 2020

Carathéodory Sampling for Stochastic Gradient Descent

Francesco Cosentino

Harald Oberhauser

Alessandro Abate

161

02 Jun 2020

Improved SVRG for quadratic functions

N. Kahalé

253

01 Jun 2020

Artificial neural networks for neuroscientists: A primer

Neuron (Neuron), 2020

G. R. Yang

Xiao-Jing Wang

449

302

01 Jun 2020

Data-Driven Methods to Monitor, Model, Forecast and Control Covid-19 Pandemic: Leveraging Data Science, Epidemiology and Control Theory

Teodoro Alamo

Daniel Gutiérrez-Reina

P. Millán

136

01 Jun 2020

Pruning via Iterative Ranking of Sensitivity Statistics

174

01 Jun 2020

Better scalability under potentially heavy-tailed gradients

Matthew J. Holland

262

01 Jun 2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

AAAI Conference on Artificial Intelligence (AAAI), 2020

459

338

01 Jun 2020

A New Accelerated Stochastic Gradient Method with Momentum

Liang Liu

Xiaopeng Luo

ODL

31 May 2020

Complex Sequential Understanding through the Awareness of Spatial and Temporal Concepts

Nature Machine Intelligence (NMI), 2020

176

30 May 2020

CoolMomentum: A Method for Stochastic Optimization by Langevin Dynamics with Simulated Annealing

Scientific Reports (Sci Rep), 2020

O. Borysenko

M. Byshkin

ODL

155

29 May 2020

HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism

USENIX Annual Technical Conference (USENIX ATC), 2020

238

166

28 May 2020

Convergence Analysis of Riemannian Stochastic Approximation Schemes

261

27 May 2020

Scalable Privacy-Preserving Distributed Learning

D. Froelicher

J. Troncoso-Pastoriza

Apostolos Pyrgelis

Sinem Sav

João Sá Sousa

Jean-Philippe Bossuat

Jean-Pierre Hubaux

FedML

269

19 May 2020