Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown

Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD

388

203

03 Mar 2018

Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis

ACM Computing Surveys (CSUR), 2018

Tal Ben-Nun

Torsten Hoefler

GNN

318

772

26 Feb 2018

GPU Accelerated Sub-Sampled Newton's Method

Sudhir B. Kylasa

Farbod Roosta-Khorasani

Michael W. Mahoney

A. Grama

ODL

160

26 Feb 2018

Complex-valued Neural Networks with Non-parametric Activation Functions

Simone Scardapane

S. Van Vaerenbergh

Amir Hussain

A. Uncini

184

22 Feb 2018

Spurious Valleys in Two-layer Neural Network Optimization Landscapes

Luca Venturi

Afonso S. Bandeira

Joan Bruna

337

18 Feb 2018

Convergence of Online Mirror Descent

Yunwen Lei

Ding-Xuan Zhou

140

18 Feb 2018

Stochastic quasi-Newton with adaptive step lengths for large-scale problems

A. Wills

Thomas B. Schön

131

12 Feb 2018

SGD and Hogwild! Convergence Without the Bounded Gradients Assumption

274

241

11 Feb 2018

Estimating Heterogeneous Consumer Preferences for Restaurants and Travel Time Using Mobile Location Data

166

22 Jan 2018

Optimal Convergence for Distributed Learning with Stochastic Gradient Methods and Spectral Algorithms

Junhong Lin

Volkan Cevher

171

22 Jan 2018

Rover Descent: Learning to optimize by learning to navigate on prototypical loss surfaces

Louis Faury

Flavian Vasile

150

22 Jan 2018

When Does Stochastic Gradient Algorithm Work Well?

139

18 Jan 2018

MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning

142

11 Jan 2018

Gradient-based Optimization for Regression in the Functional Tensor-Train Format

Alex A. Gorodetsky

J. Jakeman

244

03 Jan 2018

A Stochastic Trust Region Algorithm Based on Careful Step Normalization

INFORMS Journal on Optimization (JIO), 2017

Frank E. Curtis

K. Scheinberg

R. Shi

196

29 Dec 2017

Geometrical Insights for Implicit Generative Modeling

Léon Bottou

Martín Arjovsky

David Lopez-Paz

Maxime Oquab

225

21 Dec 2017

Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs

IEEE Transactions on Automatic Control (TAC), 2017

Adil Salim

Pascal Bianchi

W. Hachem

162

19 Dec 2017

Parallel Complexity of Forward and Backward Propagation

Maxim Naumov

172

18 Dec 2017

The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning

Siyuan Ma

Raef Bassily

M. Belkin

312

313

18 Dec 2017

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks

Shankar Krishnan

Ying Xiao

Rif A. Saurous

ODL

163

08 Dec 2017

AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks

289

149

06 Dec 2017

A two-dimensional decomposition approach for matrix completion through gossip

Mukul Bhutani

Bamdev Mishra

21 Nov 2017

Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

Ziming Zhang

M. Brand

107

20 Nov 2017

BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning

193

19 Nov 2017

Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization

221

10 Nov 2017

SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements

Francisco J. R. Ruiz

Susan Athey

David M. Blei

616

09 Nov 2017

Analysis of Biased Stochastic Gradient Descent Using Sequential Semidefinite Programs

Bin Hu

Peter M. Seiler

Laurent Lessard

294

03 Nov 2017

Don't Decay the Learning Rate, Increase the Batch Size

Samuel L. Smith

Pieter-Jan Kindermans

Chris Ying

Quoc V. Le

ODL

680

1,080

01 Nov 2017

Adaptive Sampling Strategies for Stochastic Optimization

Raghu Bollapragada

R. Byrd

J. Nocedal

111

128

30 Oct 2017

On the role of synaptic stochasticity in training low-precision neural networks

Physical Review Letters (PRL), 2017

198

26 Oct 2017

Avoiding Communication in Proximal Methods for Convex Optimization Problems

139

24 Oct 2017

Smart "Predict, then Optimize"

Management Sciences (MS), 2017

Adam N. Elmachtoub

Paul Grigas

480

735

22 Oct 2017

Convergence diagnostics for stochastic gradient descent with constant step size

Jerry Chee

Panos Toulis

191

17 Oct 2017

AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition

Hongfa Wang

155

10 Oct 2017

SGD for robot motion? The effectiveness of stochastic optimization on a new benchmark for biped locomotion tasks

Martim Brandao

K. Hashimoto

A. Takanishi

09 Oct 2017

Training Feedforward Neural Networks with Standard Logistic Activations is Feasible

Emanuele Sansone

F. D. De Natale

106

03 Oct 2017

How regularization affects the critical points in linear networks

Amirhossein Taghvaei

Jin-Won Kim

P. Mehta

151

27 Sep 2017

On Principal Components Regression, Random Projections, and Column Subsampling

M. Slawski

171

23 Sep 2017

Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form

Maxim Naumov

16 Sep 2017

ClickBAIT: Click-based Accelerated Incremental Training of Convolutional Neural Networks

Ervin Teng

João Diogo Falcão

Bob Iannucci

136

15 Sep 2017

The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems

V. Patel

MLT

110

14 Sep 2017

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

Peng Xu

Farbod Roosta-Khorasani

Michael W. Mahoney

ODL

203

156

25 Aug 2017

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information

Peng Xu

Farbod Roosta-Khorasani

Michael W. Mahoney

580

220

23 Aug 2017

Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates

L. Smith

Nicholay Topin

AI4CE

429

526

23 Aug 2017

Regularizing and Optimizing LSTM Language Models

International Conference on Learning Representations (ICLR), 2017

Stephen Merity

N. Keskar

R. Socher

344

1,147

07 Aug 2017

On the convergence properties of a K-step averaging stochastic gradient descent algorithm for nonconvex optimization

Fan Zhou

Guojing Cong

414

243

03 Aug 2017

A Robust Multi-Batch L-BFGS Method for Machine Learning

A. Berahas

Martin Takáč

AAML ODL

238

26 Jul 2017

Warped Riemannian metrics for location-scale models

Salem Said

Lionel Bombrun

Y. Berthoumieu

202

22 Jul 2017

Stochastic, Distributed and Federated Optimization for Machine Learning

Jakub Konečný

FedML

195

04 Jul 2017

Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning

Frank E. Curtis

K. Scheinberg

194

30 Jun 2017