Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
arXiv:1606.04838

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,407 papers shown
DTN: A Learning Rate Scheme with Convergence Rate of O(1/t) for SGD
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
Jayant Kalagnanam
Marten van Dijk
33
0
0
22 Jan 2019
AccUDNN: A GPU Memory Efficient Accelerator for Training Ultra-deep Neural Networks
Jinrong Guo
Wantao Liu
Wang Wang
Q. Lu
Songlin Hu
Jizhong Han
Ruixuan Li
16
9
0
21 Jan 2019
Tuning parameter selection rules for nuclear norm regularized multivariate linear regression
Pan Shang
Lingchen Kong
25
1
0
19 Jan 2019
Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization
Sattar Vakili
Sudeep Salgia
Qing Zhao
22
7
0
17 Jan 2019
Block-Randomized Stochastic Proximal Gradient for Low-Rank Tensor Factorization
Xiao Fu
Shahana Ibrahim
Hoi-To Wai
Cheng Gao
Kejun Huang
17
37
0
16 Jan 2019
Optimization Problems for Machine Learning: A Survey
Claudio Gambella
Bissan Ghaddar
Joe Naoum-Sawaya
AI4CE
30
178
0
16 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Luo Mai
Paolo Costa
Peter R. Pietzuch
21
69
0
08 Jan 2019
SGD Converges to Global Minimum in Deep Learning via Star-convex Path
Yi Zhou
Junjie Yang
Huishuai Zhang
Yingbin Liang
Vahid Tarokh
14
71
0
02 Jan 2019
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis
S. Fattahi
Somayeh Sojoudi
16
38
0
30 Dec 2018
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
46
807
0
19 Dec 2018
A stochastic approximation method for approximating the efficient frontier of chance-constrained nonlinear programs
R. Kannan
James R. Luedtke
13
4
0
17 Dec 2018
An Empirical Model of Large-Batch Training
Sam McCandlish
Jared Kaplan
Dario Amodei
OpenAI Dota Team
15
270
0
14 Dec 2018
Gradient Descent Happens in a Tiny Subspace
Guy Gur-Ari
Daniel A. Roberts
Ethan Dyer
30
229
0
12 Dec 2018
Layer-Parallel Training of Deep Residual Neural Networks
Stefanie Günther
Lars Ruthotto
J. Schroder
E. Cyr
N. Gauger
24
90
0
11 Dec 2018
A probabilistic incremental proximal gradient method
Ömer Deniz Akyildiz
Émilie Chouzenoux
Victor Elvira
Joaquín Míguez
10
3
0
04 Dec 2018
Image-based model parameter optimization using Model-Assisted Generative Adversarial Networks
Saúl Alonso-Monsalve
L. Whitehead
GAN
16
30
0
30 Nov 2018
Universal Adversarial Training
A. Mendrik
Mahyar Najibi
Zheng Xu
John P. Dickerson
L. Davis
Tom Goldstein
AAML
OOD
18
189
0
27 Nov 2018
Forward Stability of ResNet and Its Variants
Linan Zhang
Hayden Schaeffer
30
47
0
24 Nov 2018
Parallel sequential Monte Carlo for stochastic gradient-free nonconvex optimization
Ömer Deniz Akyildiz
Dan Crisan
Joaquín Míguez
17
5
0
23 Nov 2018
A Sufficient Condition for Convergences of Adam and RMSProp
Fangyu Zou
Li Shen
Zequn Jie
Weizhong Zhang
Wei Liu
33
364
0
23 Nov 2018
Distributed Gradient Descent with Coded Partial Gradient Computations
Emre Ozfatura
S. Ulukus
Deniz Gunduz
19
40
0
22 Nov 2018
New Convergence Aspects of Stochastic Gradient Algorithms
Lam M. Nguyen
Phuong Ha Nguyen
Peter Richtárik
K. Scheinberg
Martin Takáč
Marten van Dijk
23
66
0
10 Nov 2018
A Bayesian Perspective of Statistical Machine Learning for Big Data
R. Sambasivan
Sourish Das
S. Sahu
BDL
GP
16
19
0
09 Nov 2018
Double Adaptive Stochastic Gradient Optimization
Rajaditya Mukherjee
Jin Li
Shicheng Chu
Huamin Wang
ODL
24
0
0
06 Nov 2018
Non-Asymptotic Guarantees For Sampling by Stochastic Gradient Descent
Avetik G. Karagulyan
11
1
0
02 Nov 2018
Functional Nonlinear Sparse Models
Luiz F. O. Chamon
Yonina C. Eldar
Alejandro Ribeiro
13
11
0
01 Nov 2018
A general system of differential equations to model first order adaptive algorithms
André Belotto da Silva
Maxime Gazeau
16
33
0
31 Oct 2018
Kalman Gradient Descent: Adaptive Variance Reduction in Stochastic Optimization
James Vuckovic
ODL
16
15
0
29 Oct 2018
SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms
Zhe Wang
Kaiyi Ji
Yi Zhou
Yingbin Liang
Vahid Tarokh
ODL
35
81
0
25 Oct 2018
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
R. Freund
Paul Grigas
Rahul Mazumder
20
10
0
20 Oct 2018
Adaptive Communication Strategies to Achieve the Best Error-Runtime Trade-off in Local-Update SGD
Jianyu Wang
Gauri Joshi
FedML
33
231
0
19 Oct 2018
First-order and second-order variants of the gradient descent in a unified framework
Thomas Pierrot
Nicolas Perrin
Olivier Sigaud
ODL
30
7
0
18 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
26
41
0
17 Oct 2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks
Xiaodong Cui
Wei Zhang
Zoltán Tüske
M. Picheny
ODL
16
89
0
16 Oct 2018
Approximate Fisher Information Matrix to Characterise the Training of Deep Neural Networks
Zhibin Liao
Tom Drummond
Ian Reid
G. Carneiro
14
21
0
16 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
Tight Dimension Independent Lower Bound on the Expected Convergence Rate for Diminishing Step Sizes in SGD
Phuong Ha Nguyen
Lam M. Nguyen
Marten van Dijk
LRM
14
31
0
10 Oct 2018
Characterization of Convex Objective Functions and Optimal Expected Convergence Rates for SGD
Marten van Dijk
Lam M. Nguyen
Phuong Ha Nguyen
Dzung Phan
36
6
0
09 Oct 2018
Information Geometry of Orthogonal Initializations and Training
Piotr A. Sokól
Il-Su Park
AI4CE
80
16
0
09 Oct 2018
Principled Deep Neural Network Training through Linear Programming
D. Bienstock
Gonzalo Muñoz
Sebastian Pokutta
35
24
0
07 Oct 2018
Accelerating Stochastic Gradient Descent Using Antithetic Sampling
Jingchang Liu
Linli Xu
19
2
0
07 Oct 2018
Continuous-time Models for Stochastic Optimization Algorithms
Antonio Orvieto
Aurelien Lucchi
19
31
0
05 Oct 2018
Combining Natural Gradient with Hessian Free Methods for Sequence Training
Adnan Haider
P. Woodland
ODL
20
4
0
03 Oct 2018
Large batch size training of neural networks with adversarial training and second-order information
Z. Yao
A. Gholami
Daiyaan Arfeen
Richard Liaw
Joseph E. Gonzalez
Kurt Keutzer
Michael W. Mahoney
ODL
6
42
0
02 Oct 2018
Privacy-preserving Stochastic Gradual Learning
Bo Han
Ivor W. Tsang
Xiaokui Xiao
Ling-Hao Chen
S. Fung
C. Yu
NoLa
8
8
0
30 Sep 2018
Mini-batch Serialization: CNN Training with Inter-layer Data Reuse
Sangkug Lym
Armand Behroozi
W. Wen
Ge Li
Yongkee Kwon
M. Erez
12
25
0
30 Sep 2018
A fast quasi-Newton-type method for large-scale stochastic optimisation
A. Wills
Carl Jidling
Thomas B. Schon
ODL
31
7
0
29 Sep 2018
A Quantitative Analysis of the Effect of Batch Normalization on Gradient Descent
Yongqiang Cai
Qianxiao Li
Zuowei Shen
14
3
0
29 Sep 2018
Fluctuation-dissipation relations for stochastic gradient descent
Sho Yaida
32
73
0
28 Sep 2018
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Yuejie Chi
Yue M. Lu
Yuxin Chen
39
416
0
25 Sep 2018