arXiv:1606.04838, v3 (latest)
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing "Optimization Methods for Large-Scale Machine Learning"
50 / 1,490 papers shown
Shuffling Momentum Gradient Algorithm for Convex Optimization
Trang H. Tran
Quoc Tran-Dinh
Lam M. Nguyen
222
2
0
05 Mar 2024
SOFIM: Stochastic Optimization Using Regularized Fisher Information Matrix
Mrinmay Sen
A. K. Qin
Gayathri C
Raghu Kishore N
Yen-Wei Chen
Balasubramanian Raman
195
7
0
05 Mar 2024
Beyond Single-Model Views for Deep Learning: Optimization versus Generalizability of Stochastic Optimization Algorithms
Toki Tahmid Inan
Mingrui Liu
Amarda Shehu
223
0
0
01 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
369
3
0
28 Feb 2024
Gradient-based Discrete Sampling with Automatic Cyclical Scheduling
Patrick Pynadath
Riddhiman Bhattacharya
Arun Hariharan
Ruqi Zhang
179
7
0
27 Feb 2024
Efficient Backpropagation with Variance-Controlled Adaptive Sampling
Ziteng Wang
Jianfei Chen
Jun Zhu
BDL
256
5
0
27 Feb 2024
On the connection between Noise-Contrastive Estimation and Contrastive Divergence
Amanda Olmin
Jakob Lindqvist
Lennart Svensson
Fredrik Lindsten
243
2
0
26 Feb 2024
NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning
Dhananjay Saikumar
Blesson Varghese
238
2
0
21 Feb 2024
Revisiting Convergence of AdaGrad with Relaxed Assumptions
Yusu Hong
Junhong Lin
288
13
0
21 Feb 2024
Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates
Youssef Allouah
Sadegh Farhadkhani
R. Guerraoui
Nirupam Gupta
Rafael Pinot
Geovani Rizk
S. Voitovych
FedML
276
13
0
20 Feb 2024
Tracking the Median of Gradients with a Stochastic Proximal Point Method
Fabian Schaipp
Guillaume Garrigos
Umut Simsekli
Robert M. Gower
318
1
0
20 Feb 2024
OptEx: Expediting First-Order Optimization with Approximately Parallelized Iterations
Yao Shu
Jiongfeng Fang
Y. He
Fei Richard Yu
165
0
0
18 Feb 2024
AdAdaGrad: Adaptive Batch Size Schemes for Adaptive Gradient Methods
Tim Tsz-Kit Lau
Han Liu
Mladen Kolar
ODL
399
9
0
17 Feb 2024
An Accelerated Distributed Stochastic Gradient Method with Momentum
Kun-Yen Huang
Shi Pu
Angelia Nedić
387
14
0
15 Feb 2024
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Siyuan Li
Zicheng Liu
Juanxi Tian
Ge Wang
Zedong Wang
...
Cheng Tan
Tao Lin
Yang Liu
Baigui Sun
Stan Z. Li
205
11
0
14 Feb 2024
Corridor Geometry in Gradient-Based Optimization
Benoit Dherin
M. Rosca
174
1
0
13 Feb 2024
Preconditioners for the Stochastic Training of Neural Fields
Shin-Fang Chng
Hemanth Saratchandran
Simon Lucey
331
0
0
13 Feb 2024
Tuning-Free Stochastic Optimization
Ahmed Khaled
Chi Jin
249
13
0
12 Feb 2024
Accelerating Distributed Deep Learning using Lossless Homomorphic Compression
Haoyu Li
Yuchen Xu
Jiayi Chen
Rohit Dwivedula
Wenfei Wu
Keqiang He
Aditya Akella
Daehyeok Kim
FedML
AI4CE
164
6
0
12 Feb 2024
Scalable Kernel Logistic Regression with Nyström Approximation: Theoretical Analysis and Application to Discrete Choice Modelling
José Ángel Martín-Baos
Ricardo García-Ródenas
Luis Rodriguez-Benitez
Michel Bierlaire
165
2
0
09 Feb 2024
Feed-Forward Neural Networks as a Mixed-Integer Program
Navid Aftabi
Nima Moradi
Fatemeh Mahroo
133
8
0
09 Feb 2024
On the Convergence of Zeroth-Order Federated Tuning for Large Language Models
Zhenqing Ling
Daoyuan Chen
Liuyi Yao
Yaliang Li
Ying Shen
FedML
318
24
0
08 Feb 2024
An Inexact Halpern Iteration with Application to Distributionally Robust Optimization
Ling Liang
Zusen Xu
Kim-Chuan Toh
Jia Jie Zhu
397
4
0
08 Feb 2024
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
Neural Information Processing Systems (NeurIPS), 2024
Yusu Hong
Junhong Lin
413
17
0
06 Feb 2024
Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective
International Conference on Machine Learning (ICML), 2024
Wu Lin
Felix Dangel
Runa Eschenhagen
Juhan Bae
Richard Turner
Alireza Makhzani
ODL
975
19
0
05 Feb 2024
Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
ODL
159
1
0
05 Feb 2024
Frugal Actor-Critic: Sample Efficient Off-Policy Deep Reinforcement Learning Using Unique Experiences
Nikhil Kumar Singh
Indranil Saha
OffRL
127
0
0
05 Feb 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
304
3
0
05 Feb 2024
Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates
Zhuanghua Liu
Luo Luo
K. H. Low
272
3
0
04 Feb 2024
Momentum Does Not Reduce Stochastic Noise in Stochastic Gradient Descent
Naoki Sato
Hideaki Iiduka
ODL
450
1
0
04 Feb 2024
Emergence of heavy tails in homogenized stochastic gradient descent
Zhe Jiao
Martin Keller-Ressel
165
3
0
02 Feb 2024
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Hanxu Hou
Linqi Song
MQ
194
1
0
02 Feb 2024
Truncated Non-Uniform Quantization for Distributed SGD
Guangfeng Yan
Tan Li
Yuanzhang Xiao
Congduan Li
Linqi Song
MQ
121
1
0
02 Feb 2024
Towards Quantum-Safe Federated Learning via Homomorphic Encryption: Learning with Gradients
Guangfeng Yan
Shanxiang Lyu
Hanxu Hou
Zhiyong Zheng
Linqi Song
FedML
80
3
0
02 Feb 2024
HawkEye: Advancing Robust Regression with Bounded, Smooth, and Insensitive Loss Function
M. Akhtar
Muhammad Tanveer
Mohd. Arshad
134
4
0
30 Jan 2024
Leveraging Nested MLMC for Sequential Neural Posterior Estimation with Intractable Likelihoods
Xiliang Yang
Yifei Xiong
Zhijian He
429
0
0
30 Jan 2024
Low-resolution Prior Equilibrium Network for CT Reconstruction
Inverse Problems (IP), 2024
Yijie Yang
Qifeng Gao
Yuping Duan
264
0
0
28 Jan 2024
Finite-Time Analysis of On-Policy Heterogeneous Federated Reinforcement Learning
International Conference on Learning Representations (ICLR), 2024
Chenyu Zhang
Han Wang
Aritra Mitra
James Anderson
293
31
0
27 Jan 2024
How to Collaborate: Towards Maximizing the Generalization Performance in Cross-Silo Federated Learning
IEEE Transactions on Mobile Computing (IEEE TMC), 2024
Yuchang Sun
Marios Kountouris
Jun Zhang
FedML
274
4
0
24 Jan 2024
Continuous-time Riemannian SGD and SVRG Flows on Wasserstein Probabilistic Space
Mingyang Yi
Bohan Wang
348
0
0
24 Jan 2024
Fast Nonlinear Two-Time-Scale Stochastic Approximation: Achieving O(1/k) Finite-Sample Complexity
Thinh T. Doan
379
12
0
23 Jan 2024
Accelerating Distributed Stochastic Optimization via Self-Repellent Random Walks
Jie Hu
Vishwaraj Doshi
Do Young Eun
299
4
0
18 Jan 2024
Central Limit Theorem for Two-Timescale Stochastic Approximation with Markovian Noise: Theory and Applications
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Jie Hu
Vishwaraj Doshi
Do Young Eun
381
6
0
17 Jan 2024
GD doesn't make the cut: Three ways that non-differentiability affects neural network training
Siddharth Krishna Kumar
AAML
356
5
0
16 Jan 2024
Stochastic optimization with arbitrary recurrent data sampling
International Conference on Machine Learning (ICML), 2024
William G. Powell
Hanbaek Lyu
243
1
0
15 Jan 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan
Jiangshe Zhang
Junmin Liu
Yicheng Wang
Yunda Hao
AAML
315
5
0
14 Jan 2024
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
A. F. M. Saif
Xiaodong Cui
Han Shen
Songtao Lu
Brian Kingsbury
Tianyi Chen
232
7
0
13 Jan 2024
Differential Equations for Continuous-Time Deep Learning
Notices of the American Mathematical Society (Not. Amer. Math. Soc.), 2024
Lars Ruthotto
AI4TS
AI4CE
SyDa
BDL
170
11
0
08 Jan 2024
A Robbins-Monro Sequence That Can Exploit Prior Information For Faster Convergence
Siwei Liu
Ke Ma
Stephan M. Goetz
85
2
0
06 Jan 2024
On the numerical reliability of nonsmooth autodiff: a MaxPool case study
Ryan Boustany
276
1
0
05 Jan 2024