Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838, 15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown
Geometric Generalization Based Zero-Shot Learning Dataset Infinite World: Simple Yet Powerful. R. Chidambaram, Michael C. Kampffmeyer, Willie Neiswanger, Xiaodan Liang, T. Lachmann, Eric Xing. 10 Jul 2018.
SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator. Cong Fang, C. J. Li, Zhouchen Lin, Tong Zhang. Neural Information Processing Systems (NeurIPS), 2018. 04 Jul 2018.
Quasi-Monte Carlo Variational Inference. Alexander K. Buchholz, F. Wenzel, Stephan Mandt. International Conference on Machine Learning (ICML), 2018. 04 Jul 2018.
Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations. Jennifer B. Erway, J. Griffin, Roummel F. Marcia, Riadh Omheni. 01 Jul 2018.
Algorithms for solving optimization problems arising from deep neural net models: smooth problems. Vyacheslav Kungurtsev, Tomás Pevný. 30 Jun 2018.
Random Shuffling Beats SGD after Finite Epochs. Jeff Z. HaoChen, S. Sra. International Conference on Machine Learning (ICML), 2018. 26 Jun 2018.
Pushing the boundaries of parallel Deep Learning -- A practical approach. Paolo Viviani, M. Drocco, Marco Aldinucci. 25 Jun 2018.
Como funciona o Deep Learning (How Deep Learning works). M. Ponti, G. B. P. D. Costa. 20 Jun 2018.
Laplacian Smoothing Gradient Descent. Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, A. Lin. 17 Jun 2018.
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors. Atsushi Nitanda, Taiji Suzuki. 14 Jun 2018.
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam. Mohammad Emtiyaz Khan, Didrik Nielsen, Voot Tangkaratt, Wu Lin, Y. Gal, Akash Srivastava. 13 Jun 2018.
When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models? Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang. 12 Jun 2018.
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis. Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent. 11 Jun 2018.
Dissipativity Theory for Accelerating Stochastic Variance Reduction: A Unified Analysis of SVRG and Katyusha Using Semidefinite Programs. Bin Hu, S. Wright, Laurent Lessard. 10 Jun 2018.
Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data. Shuai Zheng, James T. Kwok. 08 Jun 2018.
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation. Jalaj Bhandari, Daniel Russo, Raghav Singal. 06 Jun 2018.
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes. Rachel A. Ward, Xiaoxia Wu, Léon Bottou. 05 Jun 2018.
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate. Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry. 05 Jun 2018.
Backdrop: Stochastic Backpropagation. Siavash Golkar, Kyle Cranmer. 04 Jun 2018.
Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients. Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi. 01 Jun 2018.
Accelerating Incremental Gradient Optimization with Curvature Information. Hoi-To Wai, Wei Shi, César A. Uribe, A. Nedić, Anna Scaglione. 31 May 2018.
DeepMiner: Discovering Interpretable Representations for Mammogram Classification and Explanation. Jimmy Wu, Bolei Zhou, D. Peck, S. Hsieh, V. Dialani, Lester W. Mackey, Genevieve Patterson. 31 May 2018.
On Consensus-Optimality Trade-offs in Collaborative Deep Learning. Zhanhong Jiang, Aditya Balu, Chinmay Hegde, Soumik Sarkar. 30 May 2018.
Bayesian Learning with Wasserstein Barycenters. Julio D. Backhoff Veraguas, J. Fontbona, Gonzalo Rios, Felipe A. Tobar. 28 May 2018.
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes. Loucas Pillaud-Vivien, Alessandro Rudi, Francis R. Bach. 25 May 2018.
Stochastic algorithms with descent guarantees for ICA. Pierre Ablin, Alexandre Gramfort, J. Cardoso, Francis R. Bach. 25 May 2018.
LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning. Tianyi Chen, G. Giannakis, Tao Sun, W. Yin. 25 May 2018.
A Two-Stage Subspace Trust Region Approach for Deep Neural Network Training. V. Dudar, Giovanni Chierchia, Émilie Chouzenoux, J. Pesquet, V. Semenov. 23 May 2018.
Predictive Local Smoothness for Stochastic Gradient Methods. Jun Yu Li, Hongfu Liu, Bineng Zhong, Yue Wu, Y. Fu. 23 May 2018.
Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization. Soumyadip Ghosh, M. Squillante, Ebisa D. Wollega. 22 May 2018.
LMKL-Net: A Fast Localized Multiple Kernel Learning Solver via Deep Neural Networks. Ziming Zhang. 22 May 2018.
Stochastic modified equations for the asynchronous stochastic gradient descent. Jing An, Jian-wei Lu, Lexing Ying. 21 May 2018.
On the Convergence of Stochastic Gradient Descent with Adaptive Stepsizes. Xiaoyun Li, Francesco Orabona. 21 May 2018.
Parallel and Distributed Successive Convex Approximation Methods for Big-Data Optimization. G. Scutari, Ying Sun. 17 May 2018.
Decoupled Parallel Backpropagation with Convergence Guarantee. Zhouyuan Huo, Bin Gu, Qian Yang, Heng-Chiao Huang. 27 Apr 2018.
Revisiting Small Batch Training for Deep Neural Networks. Dominic Masters, Carlo Luschi. 20 Apr 2018.
Constant Step Size Stochastic Gradient Descent for Probabilistic Modeling. Dmitry Babichev, Francis R. Bach. 16 Apr 2018.
E-commerce Anomaly Detection: A Bayesian Semi-Supervised Tensor Decomposition Approach using Natural Gradients. Anil R. Yelundur, Srinivasan H. Sengamedu, Bamdev Mishra. 11 Apr 2018.
Sequence Training of DNN Acoustic Models With Natural Gradient. Adnan Haider, P. Woodland. 06 Apr 2018.
Probabilistic Contraction Analysis of Iterated Random Operators. Abhishek Gupta, Rahul Jain, Peter Glynn. 04 Apr 2018.
A Constant Step Stochastic Douglas-Rachford Algorithm with Application to Non Separable Regularizations. Adil Salim, Pascal Bianchi, W. Hachem. 03 Apr 2018.
Training Tips for the Transformer Model. Martin Popel, Ondrej Bojar. 01 Apr 2018.
A Common Framework for Natural Gradient and Taylor based Optimisation using Manifold Theory. Adnan Haider. 26 Mar 2018.
Lower error bounds for the stochastic gradient descent optimization algorithm: Sharp convergence rates for slowly and fast decaying learning rates. Arnulf Jentzen, Philippe von Wurstemberger. 22 Mar 2018.
Group Normalization. Yuxin Wu, Kaiming He. 22 Mar 2018.
Efficient FPGA Implementation of Conjugate Gradient Methods for Laplacian System using HLS. Sahithi Rampalli, N. Sehgal, Ishita Bindlish, Tanya Tyagi, Pawan Kumar. Symposium on Field Programmable Gate Arrays (FPGA), 2018. 10 Mar 2018.
A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization. Andre Milzarek, X. Xiao, Shicong Cen, Zaiwen Wen, M. Ulbrich. 09 Mar 2018.
WNGrad: Learn the Learning Rate in Gradient Descent. Xiaoxia Wu, Rachel A. Ward, Léon Bottou. 07 Mar 2018.
Energy-entropy competition and the effectiveness of stochastic gradient descent in machine learning. Yao Zhang, Andrew M. Saxe, Madhu S. Advani, A. Lee. 05 Mar 2018.
DAGs with NO TEARS: Continuous Optimization for Structure Learning. Xun Zheng, Bryon Aragam, Pradeep Ravikumar, Eric Xing. 04 Mar 2018.
Page 28 of 30