1606.04838
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,407 papers shown
Title
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
Sumedh Gupte
Prashanth L. A.
Sanjay P. Bhat
20
1
0
28 Oct 2023
Contextual Stochastic Bilevel Optimization
Yifan Hu
Jie Wang
Yao Xie
Andreas Krause
Daniel Kuhn
34
12
0
27 Oct 2023
Performative Prediction: Past and Future
Moritz Hardt
Celestine Mendler-Dünner
31
21
0
25 Oct 2023
Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Tao Sun
Congliang Chen
Peng Qiao
Li Shen
Xinwang Liu
Dongsheng Li
36
3
0
23 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
30
4
0
21 Oct 2023
Exponential weight averaging as damped harmonic motion
J. Patsenker
Henry Li
Y. Kluger
18
0
0
20 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
26
8
0
20 Oct 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
26
0
0
19 Oct 2023
LASER: Linear Compression in Wireless Distributed Optimization
Ashok Vardhan Makkuva
Marco Bondaschi
Thijs Vogels
Martin Jaggi
Hyeji Kim
Michael C. Gastpar
87
3
0
19 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Ruoyu Sun
Zhimin Luo
27
52
0
16 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
23
11
0
16 Oct 2023
Federated Multi-Objective Learning
Haibo Yang
Zhuqing Liu
Jia-Wei Liu
Chaosheng Dong
Michinari Momma
FedML
31
7
0
15 Oct 2023
Fast Sampling and Inference via Preconditioned Langevin Dynamics
Riddhiman Bhattacharya
Tiefeng Jiang
24
1
0
11 Oct 2023
Quantum Shadow Gradient Descent for Quantum Learning
Mohsen Heidari
M. Naved
Wenbo Xie
Arjun Jacob Grama
Wojtek Szpankowski
22
1
0
10 Oct 2023
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Cong Ma
Xingyu Xu
Tian Tong
Yuejie Chi
18
9
0
09 Oct 2023
Learning Layer-wise Equivariances Automatically using Gradients
Tycho F. A. van der Ouderaa
Alexander Immer
Mark van der Wilk
MLT
50
12
0
09 Oct 2023
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
Kei Ishikawa
BDL
63
0
0
03 Oct 2023
Epidemic Learning: Boosting Decentralized Learning with Randomized Communication
M. Vos
Sadegh Farhadkhani
R. Guerraoui
Anne-Marie Kermarrec
Rafael Pires
Rishi Sharma
28
15
0
03 Oct 2023
Batch-less stochastic gradient descent for compressive learning of deep regularization for image denoising
Hui Shi
Yann Traonmilin
Jean-François Aujol
6
0
0
02 Oct 2023
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang
Lei Wu
35
3
0
01 Oct 2023
Robust Stochastic Optimization via Gradient Quantile Clipping
Ibrahim Merad
Stéphane Gaïffas
39
2
0
29 Sep 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
27
5
0
28 Sep 2023
Multi-unit soft sensing permits few-shot learning
B. Grimstad
Fadhil G. Al-Amran
Maitham G. Yousif
31
1
0
27 Sep 2023
Enhancing Sharpness-Aware Optimization Through Variance Suppression
Bingcong Li
G. Giannakis
AAML
31
19
0
27 Sep 2023
Revisiting LARS for Large Batch Training Generalization of Neural Networks
K. Do
Duong Nguyen
Hoa Nguyen
Long Tran-Thanh
Nguyen-Hoang Tran
Viet Quoc Pham
AI4CE
ODL
31
0
0
25 Sep 2023
Robust Distributed Learning: Tight Error Bounds and Breakdown Point under Data Heterogeneity
Youssef Allouah
R. Guerraoui
Nirupam Gupta
Rafael Pinot
Geovani Rizk
OOD
34
15
0
24 Sep 2023
A Novel Gradient Methodology with Economical Objective Function Evaluations for Data Science Applications
Christian Varner
Vivak Patel
34
2
0
19 Sep 2023
A Distributed Data-Parallel PyTorch Implementation of the Distributed Shampoo Optimizer for Training Neural Networks At-Scale
Hao-Jun Michael Shi
Tsung-Hsien Lee
Shintaro Iwasaki
Jose Gallego-Posada
Zhijing Li
Kaushik Rangadurai
Dheevatsa Mudigere
Michael Rabbat
ODL
25
22
0
12 Sep 2023
Derivation of Coordinate Descent Algorithms from Optimal Control Theory
I. Michael Ross
20
1
0
07 Sep 2023
Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent
Stefano Di Giovacchino
D. Higham
K. Zygalakis
26
1
0
05 Sep 2023
Majorization-Minimization for sparse SVMs
A. Benfenati
Émilie Chouzenoux
Giorgia Franchini
Salla Latva-Aijo
Dominik Narnhofer
J. Pesquet
S. J. Scott
M. Yousefi
13
1
0
31 Aug 2023
Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems
Tianchi Cai
Shenliao Bao
Jiyan Jiang
Shiji Zhou
Wenpeng Zhang
Lihong Gu
Jinjie Gu
Guannan Zhang
OffRL
31
2
0
25 Aug 2023
SGMM: Stochastic Approximation to Generalized Method of Moments
Xiaohong Chen
S. Lee
Yuan Liao
M. Seo
Youngki Shin
Myunghyun Song
19
6
0
25 Aug 2023
We Don't Need No Adam, All We Need Is EVE: On The Variance of Dual Learning Rate And Beyond
A. Khadangi
ODL
13
0
0
21 Aug 2023
Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent
Xiaoge Deng
Li Shen
Shengwei Li
Tao Sun
Dongsheng Li
Dacheng Tao
28
3
0
18 Aug 2023
Max-affine regression via first-order methods
Seonho Kim
Kiryung Lee
33
2
0
15 Aug 2023
Quantile Optimization via Multiple Timescale Local Search for Black-box Functions
Jiaqiao Hu
Meichen Song
Michael Fu
11
6
0
15 Aug 2023
Adaptive SGD with Polyak stepsize and Line-search: Robust Convergence and Variance Reduction
Xiao-Yan Jiang
Sebastian U. Stich
33
18
0
11 Aug 2023
Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization
Frank E. Curtis
Xin Jiang
Qi Wang
32
4
0
07 Aug 2023
Eva: A General Vectorized Approximation Framework for Second-order Optimization
Lin Zhang
S. Shi
Bo-wen Li
28
1
0
04 Aug 2023
Hierarchical Federated Learning in Wireless Networks: Pruning Tackles Bandwidth Scarcity and System Heterogeneity
Md Ferdous Pervej
Richeng Jin
H. Dai
32
9
0
03 Aug 2023
From continuous-time formulations to discretization schemes: tensor trains and robust regression for BSDEs and parabolic PDEs
Lorenz Richter
Leon Sallandt
Nikolas Nüsken
21
4
0
28 Jul 2023
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
50
8
0
27 Jul 2023
High Probability Analysis for Non-Convex Stochastic Optimization with Clipping
Shaojie Li
Yong Liu
35
2
0
25 Jul 2023
Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis
Yang Jiao
Kai Yang
Dongjin Song
31
1
0
25 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
25
8
0
20 Jul 2023
Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training
Nathaniel Berger
M. Exel
Matthias Huck
Stefan Riezler
26
2
0
17 Jul 2023
Decentralized Local Updates with Dual-Slow Estimation and Momentum-based Variance-Reduction for Non-Convex Optimization
Kangyang Luo
Kunkun Zhang
Sheng Zhang
Xiang Li
Ming Gao
27
2
0
17 Jul 2023
Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei
Wanrong Zhu
Wei Biao Wu
27
3
0
13 Jul 2023
Transgressing the boundaries: towards a rigorous understanding of deep learning and its (non-)robustness
C. Hartmann
Lorenz Richter
AAML
27
2
0
05 Jul 2023