Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

40 / 1,490 papers shown
Efficiency of quantum versus classical annealing in non-convex learning problems
Carlo Baldassi
R. Zecchina
221
54
0
26 Jun 2017
Faster independent component analysis by preconditioning with Hessian approximations
Pierre Ablin
J. Cardoso
Alexandre Gramfort
CML
97
156
0
25 Jun 2017
Collaborative Deep Learning in Fixed Topology Networks
Zhanhong Jiang
Aditya Balu
Chinmay Hegde
Soumik Sarkar
FedML
194
191
0
23 Jun 2017
Improved Optimization of Finite Sums with Minibatch Stochastic Variance Reduced Proximal Iterations
Jialei Wang
Tong Zhang
256
12
0
21 Jun 2017
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin
A. Pananjady
Max Lam
Dimitris Papailiopoulos
Kannan Ramchandran
Peter L. Bartlett
226
12
0
18 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
Paolo Di Lorenzo
158
9
0
15 Jun 2017
Proximal Backpropagation
International Conference on Learning Representations (ICLR), 2017
Thomas Frerix
Thomas Möllenhoff
Michael Möller
Zorah Lähner
186
32
0
14 Jun 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
581
3,944
0
08 Jun 2017
Diagonal Rescaling For Neural Networks
Jean Lafond
Nicolas Vasilache
Léon Bottou
145
12
0
25 May 2017
Diminishing Batch Normalization
Yintai Ma
Diego Klabjan
131
15
0
22 May 2017
On the diffusion approximation of nonconvex stochastic gradient descent
Junyang Qian
C. J. Li
Lei Li
Jianguo Liu
DiffM
219
24
0
22 May 2017
EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD
Mehmet A. Donmez
Maxim Raginsky
A. Singer
FedML
64
0
0
19 May 2017
An Investigation of Newton-Sketch and Subsampled Newton Methods
A. Berahas
Raghu Bollapragada
J. Nocedal
264
118
0
17 May 2017
Efficient Parallel Methods for Deep Reinforcement Learning
Alfredo V. Clemente
Humberto Nicolás Castejón Martínez
A. Chandra
182
117
0
13 May 2017
Stable Architectures for Deep Neural Networks
E. Haber
Lars Ruthotto
758
791
0
09 May 2017
SEAGLE: Sparsity-Driven Image Reconstruction under Multiple Scattering
Hsiou-Yuan Liu
Dehong Liu
Hassan Mansour
P. Boufounos
Laura Waller
Ulugbek S. Kamilov
79
81
0
05 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer
Artem Sokolov
Stefan Riezler
189
51
0
21 Apr 2017
Deep Relaxation: partial differential equations for optimizing deep neural networks
Pratik Chaudhari
Adam M. Oberman
Stanley Osher
Stefano Soatto
G. Carlier
301
159
0
17 Apr 2017
Inference via low-dimensional couplings
Alessio Spantini
Daniele Bigoni
Youssef Marzouk
360
124
0
17 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
436
834
0
15 Mar 2017
Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis
Hiroyuki Kasai
Hiroyuki Sato
Bamdev Mishra
196
22
0
15 Mar 2017
Learning across scales - A multiscale method for Convolution Neural Networks
E. Haber
Lars Ruthotto
E. Holtham
Seong-Hwan Jun
186
24
0
06 Mar 2017
Stochastic Functional Gradient for Motion Planning in Continuous Occupancy Maps
Gilad Francis
Lionel Ott
F. Ramos
83
16
0
01 Mar 2017
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Lam M. Nguyen
Jie Liu
K. Scheinberg
Martin Takáč
ODL
423
679
0
01 Mar 2017
Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems
Julianne Chung
Matthias Chung
J. T. Slagel
L. Tenorio
118
11
0
23 Feb 2017
On SGD's Failure in Practice: Characterizing and Overcoming Stalling
V. Patel
175
1
0
01 Feb 2017
Stochastic Subsampling for Factorizing Huge Matrices
IEEE Transactions on Signal Processing (IEEE TSP), 2017
A. Mensch
Julien Mairal
Bertrand Thirion
Gaël Varoquaux
214
32
0
19 Jan 2017
Towards Principled Methods for Training Generative Adversarial Networks
International Conference on Learning Representations (ICLR), 2017
Martín Arjovsky
Léon Bottou
GAN
266
2,223
0
17 Jan 2017
Stochastic Generative Hashing
International Conference on Machine Learning (ICML), 2017
Bo Dai
Ruiqi Guo
Sanjiv Kumar
Niao He
Le Song
TPM
203
112
0
11 Jan 2017
Coupling Adaptive Batch Sizes with Learning Rates
Conference on Uncertainty in Artificial Intelligence (UAI), 2016
Lukas Balles
Javier Romero
Philipp Hennig
ODL
281
122
0
15 Dec 2016
Federated Optimization: Distributed Machine Learning for On-Device Intelligence
Jakub Konečný
H. B. McMahan
Daniel Ramage
Peter Richtárik
FedML
485
2,126
0
08 Oct 2016
Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite-Sum Structure
Neural Information Processing Systems (NeurIPS), 2016
A. Bietti
Julien Mairal
521
36
0
04 Oct 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
1.3K
3,247
0
15 Sep 2016
Benchmarking State-of-the-Art Deep Learning Software Tools
International Conference on Cloud Computing and Big Data (ICCCBD), 2016
Shaoshuai Shi
Qiang-qiang Wang
Pengfei Xu
Xiaowen Chu
BDL
315
338
0
25 Aug 2016
DOOMED: Direct Online Optimization of Modeling Errors in Dynamics
Nathan D. Ratliff
Franziska Meier
Daniel Kappler
S. Schaal
146
20
0
01 Aug 2016
Tradeoffs between Convergence Speed and Reconstruction Accuracy in Inverse Problems
Raja Giryes
Yonina C. Eldar
A. Bronstein
Guillermo Sapiro
255
85
0
30 May 2016
FLAG n' FLARE: Fast Linearly-Coupled Adaptive Gradient Methods
Xiang Cheng
Farbod Roosta-Khorasani
Stefan Palombo
Peter L. Bartlett
Michael W. Mahoney
ODL
142
0
0
26 May 2016
A Multi-Batch L-BFGS Method for Machine Learning
A. Berahas
J. Nocedal
Martin Takáč
ODL
274
121
0
19 May 2016
The Proximal Robbins-Monro Method
Panos Toulis
Thibaut Horel
E. Airoldi
239
36
0
04 Oct 2015
Automatic differentiation in machine learning: a survey
A. G. Baydin
Barak A. Pearlmutter
Alexey Radul
J. Siskind
PINN
AI4CE
ODL
653
3,292
0
20 Feb 2015