1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,490 papers shown
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang
Lei Wu
435
3
0
01 Oct 2023
Robust Stochastic Optimization via Gradient Quantile Clipping
Ibrahim Merad
Stéphane Gaïffas
201
3
0
29 Sep 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
179
7
0
28 Sep 2023
Enhancing Sharpness-Aware Optimization Through Variance Suppression
Neural Information Processing Systems (NeurIPS), 2023
Bingcong Li
G. Giannakis
AAML
453
34
0
27 Sep 2023
Revisiting LARS for Large Batch Training Generalization of Neural Networks
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2023
K. Do
Duong Nguyen
Hoa Nguyen
Long Tran-Thanh
Nguyen-Hoang Tran
Quoc-Viet Pham
AI4CE
ODL
354
6
0
25 Sep 2023
Robust Distributed Learning: Tight Error Bounds and Breakdown Point under Data Heterogeneity
Neural Information Processing Systems (NeurIPS), 2023
Youssef Allouah
R. Guerraoui
Nirupam Gupta
Rafael Pinot
Geovani Rizk
OOD
289
25
0
24 Sep 2023
A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications
Christian Varner
Vivak Patel
363
2
0
19 Sep 2023
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
258
45
0
12 Sep 2023
Derivation of Coordinate Descent Algorithms from Optimal Control Theory
I. Michael Ross
59
2
0
07 Sep 2023
Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent
Journal of Computational Dynamics (J. Comput. Dyn.), 2023
Stefano Di Giovacchino
D. Higham
K. Zygalakis
179
2
0
05 Sep 2023
Majorization-Minimization for sparse SVMs
A. Benfenati
Émilie Chouzenoux
Giorgia Franchini
Salla Latva-Aijo
Dominik Narnhofer
J. Pesquet
S. J. Scott
M. Yousefi
139
1
0
31 Aug 2023
Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Tianchi Cai
Shenliao Bao
Jiyan Jiang
Shiji Zhou
Wenpeng Zhang
Lihong Gu
Jinjie Gu
Guannan Zhang
OffRL
176
3
0
25 Aug 2023
SGMM: Stochastic Approximation to Generalized Method of Moments
Xiaohong Chen
S. Lee
Yuan Liao
M. Seo
Youngki Shin
Myunghyun Song
169
7
0
25 Aug 2023
We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond
A. Khadangi
ODL
238
1
0
21 Aug 2023
Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Xiaoge Deng
Li Shen
Shengwei Li
Tao Sun
Dongsheng Li
Dacheng Tao
351
3
0
18 Aug 2023
Max-affine regression via first-order methods
SIAM Journal on Mathematics of Data Science (SIMODS), 2023
Seonho Kim
Kiryung Lee
154
3
0
15 Aug 2023
Quantile Optimization via Multiple Timescale Local Search for Black-box Functions
Operational Research (OR), 2023
Jiaqiao Hu
Meichen Song
Michael Fu
56
13
0
15 Aug 2023
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
Neural Information Processing Systems (NeurIPS), 2023
Xiao-Yan Jiang
Sebastian U. Stich
243
30
0
11 Aug 2023
Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization
Journal of Optimization Theory and Applications (JOTA), 2023
Frank E. Curtis
Xin Jiang
Qi Wang
191
8
0
07 Aug 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
Shaoshuai Shi
Yue Liu
221
1
0
04 Aug 2023
Hierarchical Federated Learning in Wireless Networks: Pruning Tackles Bandwidth Scarcity and System Heterogeneity
IEEE Transactions on Wireless Communications (IEEE TWC), 2023
Md Ferdous Pervej
Richeng Jin
H. Dai
357
23
0
03 Aug 2023
From continuous-time formulations to discretization schemes: tensor trains and robust regression for BSDEs and parabolic PDEs
Journal of machine learning research (JMLR), 2023
Lorenz Richter
Leon Sallandt
Nikolas Nusken
195
8
0
28 Jul 2023
The Marginal Value of Momentum for Small Learning Rate SGD
International Conference on Learning Representations (ICLR), 2023
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
242
10
0
27 Jul 2023
High Probability Analysis for Non-Convex Stochastic Optimization with Clipping
European Conference on Artificial Intelligence (ECAI), 2023
Shaojie Li
Yong Liu
220
5
0
25 Jul 2023
Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis
IEEE Transactions on Mobile Computing (IEEE TMC), 2023
Yang Jiao
Kai Yang
Dongjin Song
351
4
0
25 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Machine-mediated learning (ML), 2023
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
235
14
0
20 Jul 2023
Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training
European Association for Machine Translation Conferences/Workshops (EAMT), 2023
Nathaniel Berger
M. Exel
Matthias Huck
Stefan Riezler
238
2
0
17 Jul 2023
Decentralized Local Updates with Dual-Slow Estimation and Momentum-based Variance-Reduction for Non-Convex Optimization
European Conference on Artificial Intelligence (ECAI), 2023
Kangyang Luo
Kunkun Zhang
Sheng Zhang
Xiang Li
Ming Gao
127
2
0
17 Jul 2023
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei
Wanrong Zhu
Wei Biao Wu
353
6
0
13 Jul 2023
Transgressing the boundaries: towards a rigorous understanding of deep learning and its (non-)robustness
C. Hartmann
Lorenz Richter
AAML
206
2
0
05 Jul 2023
TablEye: Seeing small Tables through the Lens of Images
Seungeun Lee
Sang-Chul Lee
LMTD
244
2
0
04 Jul 2023
Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Peng Mi
Li Shen
Tianhe Ren
Weihao Ye
Tianshuo Xu
Xiaoshuai Sun
Tongliang Liu
Rongrong Ji
Dacheng Tao
AAML
256
3
0
30 Jun 2023
Training Deep Surrogate Models with Large Scale Online Learning
International Conference on Machine Learning (ICML), 2023
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
3DGS
AI4CE
181
8
0
28 Jun 2023
G-TRACER: Expected Sharpness Optimization
John R. Williams
Stephen J. Roberts
148
0
0
24 Jun 2023
Efficient preconditioned stochastic gradient descent for estimation in latent variable models
International Conference on Machine Learning (ICML), 2023
C. Baey
Maud Delattre
E. Kuhn
Jean-Benoist Léger
Sarah Lemler
148
6
0
22 Jun 2023
Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models
Neural Information Processing Systems (NeurIPS), 2023
Leonardo Galli
Holger Rauhut
Mark Schmidt
215
17
0
22 Jun 2023
Empirical Risk Minimization with Shuffled SGD: A Primal-Dual Perspective and Improved Bounds
Xu Cai
Cheuk Yin Lin
Jelena Diakonikolas
FedML
250
6
0
21 Jun 2023
MimiC: Combating Client Dropouts in Federated Learning by Mimicking Central Updates
IEEE Transactions on Mobile Computing (IEEE TMC), 2023
Yuchang Sun
Yuyi Mao
Jinchao Zhang
FedML
263
23
0
21 Jun 2023
Adaptive Federated Learning with Auto-Tuned Clients
International Conference on Learning Representations (ICLR), 2023
Junhyung Lyle Kim
Taha Toghani
César A. Uribe
Anastasios Kyrillidis
FedML
557
14
0
19 Jun 2023
Bootstrapped Representations in Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Charline Le Lan
Stephen Tu
Mark Rowland
Anna Harutyunyan
Rishabh Agarwal
Marc G. Bellemare
Will Dabney
OffRL
OOD
SSL
254
12
0
16 Jun 2023
Schema-learning and rebinding as mechanisms of in-context learning and emergence
Neural Information Processing Systems (NeurIPS), 2023
Siva K. Swaminathan
Antoine Dedieu
Rajkumar Vasudeva Raju
Murray Shanahan
Miguel Lazaro-Gredilla
Dileep George
223
22
0
16 Jun 2023
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Xianbiao Qi
Jianan Wang
Lei Zhang
212
0
0
15 Jun 2023
Robustly Learning a Single Neuron via Sharpness
International Conference on Machine Learning (ICML), 2023
Puqian Wang
Nikos Zarifis
Ilias Diakonikolas
Jelena Diakonikolas
188
13
0
13 Jun 2023
GQFedWAvg: Optimization-Based Quantized Federated Learning in General Edge Computing Systems
IEEE Transactions on Wireless Communications (IEEE TWC), 2023
Yangchen Li
Ying Cui
Vincent K. N. Lau
FedML
253
4
0
13 Jun 2023
Analysis of the Relative Entropy Asymmetry in the Regularization of Empirical Risk Minimization
International Symposium on Information Theory (ISIT), 2023
Francisco Daunas
I. Esnaola
S. Perlaza
H. Vincent Poor
248
23
0
12 Jun 2023
Straggler-Resilient Decentralized Learning via Adaptive Asynchronous Updates
ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 2023
Efstathia Soufleri
Gang Yan
Maroun Touma
Jian Li
260
7
0
11 Jun 2023
Improving Accelerated Federated Learning with Compression and Importance Sampling
Michał Grudzień
Grigory Malinovsky
Peter Richtárik
FedML
280
11
0
05 Jun 2023
Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning
IEEE Transactions on Wireless Communications (IEEE TWC), 2023
Yao Tang
Guangxu Zhu
Wei Xu
M. H. Cheung
T. Lok
Shuguang Cui
171
17
0
05 Jun 2023
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
International Conference on Machine Learning (ICML), 2023
Tongtian Zhu
Fengxiang He
Kaixuan Chen
Weilong Dai
Dacheng Tao
663
19
0
05 Jun 2023
Toward Understanding Why Adam Converges Faster Than SGD for Transformers
Yan Pan
Yuanzhi Li
304
54
0
31 May 2023