1606.04838
Cited By
v1
v2
v3 (latest)
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,485 papers shown
Title
Continual Learning: Forget-free Winning Subnetworks for Video Representations
Haeyong Kang
Jaehong Yoon
Sung Ju Hwang
Chang D. Yoo
CLL
428
5
0
19 Dec 2023
DePRL: Achieving Linear Convergence Speedup in Personalized Decentralized Learning with Shared Representations
Efstathia Soufleri
Gang Yan
Maroun Touma
Jian Li
207
9
0
17 Dec 2023
Physics-Informed Deep Learning of Rate-and-State Fault Friction
Computer Methods in Applied Mechanics and Engineering (CMAME), 2023
Cody Rucker
Brittany A. Erickson
PINN
AI4CE
203
14
0
14 Dec 2023
Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning
Guangfeng Yan
Tan Li
Tian-Shing Lan
Kui Wu
Linqi Song
195
9
0
12 Dec 2023
An $LDL^T$ Trust-Region Quasi-Newton Method
John Brust
Philip E. Gill
47
0
0
11 Dec 2023
ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment
Paniz Halvachi
Alexandra Peste
Dan Alistarh
Christoph H. Lampert
126
0
0
11 Dec 2023
Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
Rui Ye
Yaxin Du
Zhenyang Ni
Siheng Chen
Yanfeng Wang
FedML
133
8
0
10 Dec 2023
TaskMet: Task-Driven Metric Learning for Model Learning
Neural Information Processing Systems (NeurIPS), 2023
Dishank Bansal
Ricky T. Q. Chen
Mustafa Mukadam
Brandon Amos
FedML
232
15
0
08 Dec 2023
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Journal of Optimization Theory and Applications (JOTA), 2023
Rajeeva Laxman Karandikar
M. Vidyasagar
271
17
0
05 Dec 2023
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization
Junwen Qiu
Xiao Li
Andre Milzarek
445
3
0
02 Dec 2023
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni
Nicklas Werge
ODL
231
4
0
29 Nov 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent
Frederik Köhne
Leonie Kreis
Anton Schiela
Roland A. Herzog
245
2
0
28 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
150
0
0
27 Nov 2023
Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia
IEEE Access (IEEE Access), 2023
Milad Baghalzadeh Shishehgarkhaneh
R. Moehler
Yihai Fang
Amer A. Hijazi
Hamed Aboutorab
239
15
0
23 Nov 2023
Soft Random Sampling: A Theoretical and Empirical Analysis
Xiaodong Cui
Ashish R. Mittal
Songtao Lu
Wei Zhang
G. Saon
Brian Kingsbury
212
2
0
21 Nov 2023
Infinite forecast combinations based on Dirichlet process
Yinuo Ren
Feng Li
Yanfei Kang
Jue Wang
AI4TS
157
0
0
21 Nov 2023
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
253
3
0
20 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
312
4
0
15 Nov 2023
Non-Uniform Smoothness for Gradient Descent
A. Berahas
Lindon Roberts
Fred Roosta
143
5
0
15 Nov 2023
Robust softmax aggregation on blockchain based federated learning with convergence guarantee
Huiyu Wu
Diego Klabjan
FedML
219
3
0
13 Nov 2023
Differentiable Cutting-plane Layers for Mixed-integer Linear Optimization
Gabriele Dragotto
Stefan Clarke
J. F. Fisac
Bartolomeo Stellato
412
7
0
06 Nov 2023
Parameter-Agnostic Optimization under Relaxed Smoothness
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Florian Hübler
Junchi Yang
Xiang Li
Niao He
228
27
0
06 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
508
0
0
06 Nov 2023
High Probability Convergence of Adam Under Unbounded Gradients and Affine Variance Noise
Yusu Hong
Junhong Lin
169
11
0
03 Nov 2023
Learning to optimize by multi-gradient for multi-objective optimization
Linxi Yang
Xinmin Yang
L. Tang
228
1
0
01 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
108
0
0
31 Oct 2023
High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise
Aleksandar Armacki
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
554
10
0
28 Oct 2023
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
IEEE Conference on Decision and Control (CDC), 2023
Sumedh Gupte
A. Prashanth L.
Sanjay P. Bhat
140
2
0
28 Oct 2023
Contextual Stochastic Bilevel Optimization
Neural Information Processing Systems (NeurIPS), 2023
Yifan Hu
Jie Wang
Yao Xie
Andreas Krause
Daniel Kuhn
152
18
0
27 Oct 2023
Performative Prediction: Past and Future
Statistical Science (Statist. Sci.), 2023
Moritz Hardt
Celestine Mendler-Dünner
319
38
0
25 Oct 2023
Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Tao Sun
Congliang Chen
Peng Qiao
Li Shen
Xinwang Liu
Dongsheng Li
157
6
0
23 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
195
6
0
21 Oct 2023
Exponential weight averaging as damped harmonic motion
J. Patsenker
Henry Li
Y. Kluger
163
0
0
20 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
173
19
0
20 Oct 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
207
1
0
19 Oct 2023
LASER: Linear Compression in Wireless Distributed Optimization
Ashok Vardhan Makkuva
Marco Bondaschi
Thijs Vogels
Martin Jaggi
Hyeji Kim
Michael C. Gastpar
304
6
0
19 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
International Conference on Machine Learning (ICML), 2023
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Tian Ding
Zhimin Luo
299
119
0
16 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
357
21
0
16 Oct 2023
Federated Multi-Objective Learning
Haibo Yang
Zhuqing Liu
Jia-Wei Liu
Chaosheng Dong
Michinari Momma
FedML
180
18
0
15 Oct 2023
Fast Sampling and Inference via Preconditioned Langevin Dynamics
Riddhiman Bhattacharya
Tiefeng Jiang
110
3
0
11 Oct 2023
Quantum Shadow Gradient Descent for Quantum Learning
Mohsen Heidari
M. Naved
Wenbo Xie
Arjun Jacob Grama
Wojtek Szpankowski
119
0
0
10 Oct 2023
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Cong Ma
Xingyu Xu
Tian Tong
Yuejie Chi
220
12
0
09 Oct 2023
Learning Layer-wise Equivariances Automatically using Gradients
Neural Information Processing Systems (NeurIPS), 2023
Tycho F. A. van der Ouderaa
Alexander Immer
Mark van der Wilk
MLT
228
19
0
09 Oct 2023
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
Kei Ishikawa
BDL
217
0
0
03 Oct 2023
Epidemic Learning: Boosting Decentralized Learning with Randomized Communication
Neural Information Processing Systems (NeurIPS), 2023
M. Vos
Sadegh Farhadkhani
R. Guerraoui
Anne-Marie Kermarrec
Rafael Pires
Rishi Sharma
231
24
0
03 Oct 2023
Batch-less stochastic gradient descent for compressive learning of deep regularization for image denoising
Journal of Mathematical Imaging and Vision (JMIV), 2023
Hui Shi
Yann Traonmilin
Jean-François Aujol
157
1
0
02 Oct 2023
A Theoretical Analysis of Noise Geometry in Stochastic Gradient Descent
Mingze Wang
Lei Wu
321
3
0
01 Oct 2023
Robust Stochastic Optimization via Gradient Quantile Clipping
Ibrahim Merad
Stéphane Gaïffas
158
3
0
29 Sep 2023
High Throughput Training of Deep Surrogates from Large Ensemble Runs
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Lucas Meyer
M. Schouler
R. Caulk
Alejandro Ribés
Bruno Raffin
AI4CE
129
7
0
28 Sep 2023
Enhancing Sharpness-Aware Optimization Through Variance Suppression
Neural Information Processing Systems (NeurIPS), 2023
Bingcong Li
G. Giannakis
AAML
296
31
0
27 Sep 2023