arXiv:1606.04838
Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown
Universal Adversarial Training
A. Mendrik
Mahyar Najibi
Zheng Xu
John P. Dickerson
L. Davis
Tom Goldstein
AAML, OOD
252
205
0
27 Nov 2018
Forward Stability of ResNet and Its Variants
Journal of Mathematical Imaging and Vision (JMIV), 2018
Linan Zhang
Hayden Schaeffer
199
53
0
24 Nov 2018
Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
Statistics and computing (Stat. Comput.), 2018
Ömer Deniz Akyildiz
Dan Crisan
Joaquín Míguez
192
8
0
23 Nov 2018
A Sufficient Condition for Convergences of Adam and RMSProp
Computer Vision and Pattern Recognition (CVPR), 2018
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu
277
419
0
23 Nov 2018
Distributed Gradient Descent with Coded Partial Gradient Computations
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018
Emre Ozfatura
S. Ulukus
Deniz Gunduz
221
41
0
22 Nov 2018
New Convergence Aspects of Stochastic Gradient Algorithms
Lam M. Nguyen
Phuong Ha Nguyen
Peter Richtárik
K. Scheinberg
Martin Takáč
Marten van Dijk
390
70
0
10 Nov 2018
A Bayesian Perspective of Statistical Machine Learning for Big Data
Computational statistics (Zeitschrift) (Comput. Stat.), 2018
R. Sambasivan
Sourish Das
S. Sahu
BDL, GP
190
21
0
09 Nov 2018
Double Adaptive Stochastic Gradient Optimization
Rajaditya Mukherjee
Jin Li
Shicheng Chu
Huamin Wang
ODL
129
0
0
06 Nov 2018
Non-Asymptotic Guarantees For Sampling by Stochastic Gradient Descent
Avetik G. Karagulyan
92
1
0
02 Nov 2018
Functional Nonlinear Sparse Models
Luiz F. O. Chamon
Yonina C. Eldar
Alejandro Ribeiro
354
11
0
01 Nov 2018
A general system of differential equations to model first order adaptive algorithms
André Belotto da Silva
Maxime Gazeau
214
38
0
31 Oct 2018
Kalman Gradient Descent: Adaptive Variance Reduction in Stochastic Optimization
James Vuckovic
ODL
136
16
0
29 Oct 2018
SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms
Zhe Wang
Kaiyi Ji
Yi Zhou
Yingbin Liang
Vahid Tarokh
ODL
205
83
0
25 Oct 2018
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
R. Freund
Paul Grigas
Rahul Mazumder
139
10
0
20 Oct 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang
Gauri Joshi
FedML
207
245
0
19 Oct 2018
First-order and second-order variants of the gradient descent in a unified framework
Thomas Pierrot
Nicolas Perrin
Olivier Sigaud
ODL
334
7
0
18 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
201
47
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
192
99
0
16 Oct 2018
Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
Zhibin Liao
Tom Drummond
Ian Reid
G. Carneiro
151
25
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM, OffRL
367
143
0
15 Oct 2018
Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD
Phuong Ha Nguyen
Lam M. Nguyen
Marten van Dijk
LRM
231
36
0
10 Oct 2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Marten van Dijk
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
235
6
0
09 Oct 2018
Information Geometry of Orthogonal Initializations and Training
Piotr A. Sokól
Il-Su Park
AI4CE
294
17
0
09 Oct 2018
Principled Deep Neural Network Training through Linear Programming
D. Bienstock
Gonzalo Muñoz
Sebastian Pokutta
269
25
0
07 Oct 2018
Accelerating Stochastic Gradient Descent Using Antithetic Sampling
Jingchang Liu
Linli Xu
152
3
0
07 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms
Antonio Orvieto
Aurelien Lucchi
298
33
0
05 Oct 2018
Combining Natural Gradient with Hessian Free Methods for Sequence Training
Adnan Haider
P. Woodland
ODL
92
4
0
03 Oct 2018
Large batch size training of neural networks with adversarial training and second-order information
Z. Yao
A. Gholami
Daiyaan Arfeen
Richard Liaw
Alfons Kemper
Kurt Keutzer
Michael W. Mahoney
ODL
288
46
0
02 Oct 2018
Privacy-preserving Stochastic Gradual Learning
Bo Han
Ivor W. Tsang
Xiaokui Xiao
Ling-Hao Chen
S. Fung
C. Yu
NoLa
133
9
0
30 Sep 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
121
27
0
30 Sep 2018
A fast quasi-Newton-type method for large-scale stochastic optimisation
A. Wills
Carl Jidling
Thomas B. Schon
ODL
132
7
0
29 Sep 2018
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent
Yongqiang Cai
Qianxiao Li
Zuowei Shen
94
3
0
29 Sep 2018
Fluctuation-dissipation relations for stochastic gradient descent
Sho Yaida
369
80
0
28 Sep 2018
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
IEEE Transactions on Signal Processing (IEEE Trans. Signal Process.), 2018
Yuejie Chi
Yue M. Lu
Yuxin Chen
453
460
0
25 Sep 2018
Predictive Collective Variable Discovery with Deep Bayesian Models
M. Schöberl
N. Zabaras
P. Koutsourelakis
227
36
0
18 Sep 2018
A Unified Batch Online Learning Framework for Click Prediction
Rishabh K. Iyer
Nimit Acharya
Tanuja Bompada
Denis Xavier Charles
Eren Manavoglu
50
2
0
12 Sep 2018
MotherNets: Rapid Deep Ensemble Learning
Abdul Wasay
Brian Hentschel
Yuze Liao
Sanyuan Chen
Stratos Idreos
176
39
0
12 Sep 2018
MDCN: Multi-Scale, Deep Inception Convolutional Neural Networks for Efficient Object Detection
Wenchi Ma
Yuanwei Wu
Zongbo Wang
Guanghui Wang
ObjD
156
25
0
06 Sep 2018
Compositional Stochastic Average Gradient for Machine Learning and Related Applications
Tsung-Yu Hsieh
Y. El-Manzalawy
Yiwei Sun
Vasant Honavar
187
1
0
04 Sep 2018
Distributed Nonconvex Constrained Optimization over Time-Varying Digraphs
G. Scutari
Ying Sun
210
192
0
04 Sep 2018
Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant
D. Loroch
Franz-Josef Pfreundt
Norbert Wehn
J. Keuper
113
5
0
27 Aug 2018
Deep Learning: Computational Aspects
Nicholas G. Polson
Vadim Sokolov
PINN, BDL, AI4CE
150
14
0
26 Aug 2018
Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms
Jianyu Wang
Gauri Joshi
415
357
0
22 Aug 2018
Experiential Robot Learning with Accelerated Neuroevolution
Ahmed Aly
J. Dugan
75
1
0
16 Aug 2018
Backtracking gradient descent method for general $C^1$ functions, with applications to Deep Learning
T. Truong
T. H. Nguyen
167
10
0
15 Aug 2018
On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
Xiangyi Chen
Sijia Liu
Tian Ding
Mingyi Hong
282
352
0
08 Aug 2018
Stochastic Gradient Descent with Biased but Consistent Gradient Estimators
Jie Chen
Ronny Luss
266
46
0
31 Jul 2018
Particle Filtering Methods for Stochastic Optimization with Application to Large-Scale Empirical Risk Minimization
Bin Liu
517
11
0
23 Jul 2018
Newton-ADMM: A Distributed GPU-Accelerated Optimizer for Multiclass Classification Problems
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2018
Chih-Hao Fang
Sudhir B. Kylasa
Fred Roosta
Michael W. Mahoney
A. Grama
ODL
181
11
0
18 Jul 2018
Training Neural Networks Using Features Replay
Zhouyuan Huo
Bin Gu
Heng-Chiao Huang
292
75
0
12 Jul 2018