Optimization Methods for Large-Scale Machine Learning
Léon Bottou, Frank E. Curtis, J. Nocedal
arXiv:1606.04838 (v3, latest) · 15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

Showing 50 of 1,491 citing papers (page 18 of 30).

An Introduction to Deep Generative Modeling
GAMM-Mitteilungen, 2021
Lars Ruthotto, E. Haber
Tags: AI4CE
09 Mar 2021

On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
N. Emmenegger, Rasmus Kyng, Ahad N. Zehmakan
08 Mar 2021

Bayesian imaging using Plug & Play priors: when Langevin meets Tweedie
SIAM Journal on Imaging Sciences (SIAM J. Imaging Sci.), 2021
R. Laumont, Valentin De Bortoli, Andrés Almansa, J. Delon, Alain Durmus, Marcelo Pereyra
08 Mar 2021

A Retrospective Approximation Approach for Smooth Stochastic Optimization
Mathematics of Operations Research (MOR), 2021
David Newton, Raghu Bollapragada, R. Pasupathy, N. Yip
07 Mar 2021

On the Importance of Sampling in Training GCNs: Tighter Analysis and Variance Reduction
Weilin Cong, M. Ramezani, M. Mahdavi
03 Mar 2021

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Sebastian U. Stich, Amirkeivan Mohtashami, Martin Jaggi
03 Mar 2021

Deep Recurrent Encoder: A scalable end-to-end network to model brain signals
Neurons, Behavior, Data analysis, and Theory (NBDT), 2021
O. Chehab, Alexandre Défossez, Jean-Christophe Loiseau, Alexandre Gramfort, J. King
Tags: AI4TS
03 Mar 2021

Adaptive Transmission Scheduling in Wireless Networks for Asynchronous Federated Learning
IEEE Journal on Selected Areas in Communications (JSAC), 2021
Hyun-Suk Lee, Jang-Won Lee
02 Mar 2021

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
International Conference on Learning Representations (ICLR), 2021
Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar
Tags: ODL
26 Feb 2021

Wirelessly Powered Federated Edge Learning: Optimal Tradeoffs Between Convergence and Power Transfer
IEEE Transactions on Wireless Communications (IEEE TWC), 2021
Qunsong Zeng, Yuqing Du, Kaibin Huang
24 Feb 2021

Escaping from Zero Gradient: Revisiting Action-Constrained Reinforcement Learning via Frank-Wolfe Policy Optimization
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Jyun-Li Lin, Wei-Ting Hung, Shangtong Yang, Ping-Chun Hsieh, Xi Liu
22 Feb 2021

AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods
Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč
Tags: ODL
19 Feb 2021

Finite-Sample Analysis of Off-Policy Natural Actor-Critic Algorithm
International Conference on Machine Learning (ICML), 2021
S. Khodadadian, Zaiwei Chen, S. T. Maguluri
Tags: CML, OffRL
18 Feb 2021

Differential Privacy and Byzantine Resilience in SGD: Do They Add Up?
ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), 2021
R. Guerraoui, Nirupam Gupta, Rafael Pinot, Sébastien Rouault, John Stephan
16 Feb 2021

Learning by Turning: Neural Architecture Aware Optimisation
International Conference on Machine Learning (ICML), 2021
Yang Liu, Jeremy Bernstein, M. Meister, Yisong Yue
Tags: ODL
14 Feb 2021

Newton Method over Networks is Fast up to the Statistical Precision
International Conference on Machine Learning (ICML), 2021
Amir Daneshmand, G. Scutari, Pavel Dvurechensky, Alexander Gasnikov
12 Feb 2021

Straggler-Resilient Distributed Machine Learning with Dynamic Backup Workers
Efstathia Soufleri, Gang Yan, Rahul Singh, Jian Li
11 Feb 2021

An Adaptive Stochastic Sequential Quadratic Programming with Differentiable Exact Augmented Lagrangians
Mathematical Programming (Math. Program.), 2021
Sen Na, M. Anitescu, Mladen Kolar
10 Feb 2021

Attentive Gaussian processes for probabilistic time-series generation
Kuilin Chen, Chi-Guhn Lee
Tags: AI4TS
10 Feb 2021

Consensus Control for Decentralized Deep Learning
International Conference on Machine Learning (ICML), 2021
Lingjing Kong, Tao Lin, Anastasia Koloskova, Martin Jaggi, Sebastian U. Stich
09 Feb 2021

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data
International Conference on Machine Learning (ICML), 2021
Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi
Tags: FedML
09 Feb 2021

Large-Scale Training System for 100-Million Classification at Alibaba
Knowledge Discovery and Data Mining (KDD), 2020
Liuyihan Song, Pan Pan, Kang Zhao, Hao Yang, Yiming Chen, Yingya Zhang, Yinghui Xu, Rong Jin
09 Feb 2021

Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar
Tags: FedML, MQ
08 Feb 2021

SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality
Annual Conference on Computational Learning Theory (COLT), 2021
Courtney Paquette, Kiwon Lee, Fabian Pedregosa, Elliot Paquette
08 Feb 2021

Federated Learning on the Road: Autonomous Controller Design for Connected and Autonomous Vehicles
IEEE Transactions on Wireless Communications (IEEE TWC), 2021
Tengchan Zeng, Omid Semiari, Mingzhe Chen, Walid Saad, M. Bennis
Tags: FedML
05 Feb 2021

Local Critic Training for Model-Parallel Learning of Deep Neural Networks
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Hojung Lee, Cho-Jui Hsieh, Jong-Seok Lee
03 Feb 2021

The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication
Annual Conference on Computational Learning Theory (COLT), 2021
Blake E. Woodworth, Brian Bullins, Ohad Shamir, Nathan Srebro
02 Feb 2021

A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants
Zaiwei Chen, S. T. Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam
Tags: OffRL
02 Feb 2021

Stochastic Online Convex Optimization. Application to probabilistic time series forecasting
Electronic Journal of Statistics (EJS), 2021
Olivier Wintenberger
Tags: AI4TS
01 Feb 2021

Parameter-free Stochastic Optimization of Variationally Coherent Functions
Francesco Orabona, Dávid Pál
30 Jan 2021

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2021
Cong Fang, Hangfeng He, Qi Long, Weijie J. Su
Tags: FAtt
29 Jan 2021

Byzantine Fault-Tolerance in Peer-to-Peer Distributed Gradient-Descent
Nirupam Gupta, Nitin H. Vaidya
28 Jan 2021

Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning
International Conference on Learning Representations (ICLR), 2021
Haibo Yang, Minghong Fang, Jia Liu
Tags: FedML
27 Jan 2021

SGD-Net: Efficient Model-Based Deep Learning with Theoretical Guarantees
IEEE Transactions on Computational Imaging (IEEE Trans. Comput. Imaging), 2021
Jiaming Liu, Yu Sun, Weijie Gan, Xiaojian Xu, B. Wohlberg, Ulugbek S. Kamilov
Tags: FedML, MedIm
22 Jan 2021

Approximate Byzantine Fault-Tolerance in Distributed Optimization
ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing (PODC), 2021
Shuo Liu, Nirupam Gupta, Nitin H. Vaidya
22 Jan 2021

Gravity Optimizer: a Kinematic Approach on Optimization in Deep Learning
Dariush Bahrami, Sadegh Pouriyan Zadeh
Tags: ODL
22 Jan 2021

Linear Regression with Distributed Learning: A Generalization Error Perspective
IEEE Transactions on Signal Processing (IEEE TSP), 2021
Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén
Tags: FedML
22 Jan 2021

Clairvoyant Prefetching for Distributed Machine Learning I/O
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2021
Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler
21 Jan 2021

Learning DNN networks using un-rectifying ReLU with compressed sensing application
W. Hwang, Shih-Shuo Tung
18 Jan 2021

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration
Journal of Machine Learning Research (JMLR), 2021
Congliang Chen, Li Shen, Fangyu Zou, Wei Liu
14 Jan 2021

Machine learning classification of non-Markovian noise disturbing quantum dynamics
Physica Scripta (Phys. Scr.), 2021
Stefano Martina, S. Gherardini, Filippo Caruso
08 Jan 2021

Delayed Projection Techniques for Linearly Constrained Problems: Convergence Rates, Acceleration, and Applications
Xiang Li, Zhihua Zhang
05 Jan 2021

Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
04 Jan 2021

First-Order Methods for Convex Optimization
EURO Journal on Computational Optimization (EJCO), 2021
Pavel Dvurechensky, Mathias Staudigl, Shimrit Shtern
Tags: ODL
04 Jan 2021

An iterative K-FAC algorithm for Deep Learning
Yingshi Chen
Tags: ODL
01 Jan 2021

CADA: Communication-Adaptive Distributed Adam
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Tianyi Chen, Ziye Guo, Yuejiao Sun, W. Yin
Tags: ODL
31 Dec 2020

Unbiased Gradient Estimation for Distributionally Robust Learning
Soumyadip Ghosh, M. Squillante
Tags: OOD
22 Dec 2020

Image-Based Jet Analysis
Michael Kagan
17 Dec 2020

Are we Forgetting about Compositional Optimisers in Bayesian Optimisation?
Journal of Machine Learning Research (JMLR), 2020
Antoine Grosnit, Alexander I. Cowen-Rivers, Rasul Tutunov, Ryan-Rhys Griffiths, Jun Wang, Haitham Bou-Ammar
15 Dec 2020

Better scalability under potentially heavy-tailed feedback
Matthew J. Holland
14 Dec 2020