1606.04838
Cited By
v1
v2
v3 (latest)
Optimization Methods for Large-Scale Machine Learning
15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal
Papers citing
"Optimization Methods for Large-Scale Machine Learning"
50 / 1,490 papers shown
FedNS: A Fast Sketching Newton-Type Algorithm for Federated Learning
Jian Li
Yong Liu
Wei Wang
Haoran Wu
Weiping Wang
FedML
277
6
0
05 Jan 2024
Online Continual Domain Adaptation for Semantic Image Segmentation Using Internal Representations
Serban Stan
Mohammad Rostami
OOD
CLL
243
0
0
02 Jan 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Robert Mansel Gower
Martin Takáč
297
5
0
28 Dec 2023
Parallel Trust-Region Approaches in Neural Network Training: Beyond Traditional Methods
Ken Trotti
Samuel A. Cruz Alegría
Alena Kopanicáková
Rolf Krause
218
2
0
21 Dec 2023
Continual Learning: Forget-free Winning Subnetworks for Video Representations
Haeyong Kang
Jaehong Yoon
Sung Ju Hwang
Chang D. Yoo
CLL
533
5
0
19 Dec 2023
DePRL: Achieving Linear Convergence Speedup in Personalized Decentralized Learning with Shared Representations
Efstathia Soufleri
Gang Yan
Maroun Touma
Jian Li
302
9
0
17 Dec 2023
Physics-Informed Deep Learning of Rate-and-State Fault Friction
Computer Methods in Applied Mechanics and Engineering (CMAME), 2023
Cody Rucker
Brittany A. Erickson
PINN
AI4CE
268
14
0
14 Dec 2023
Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning
Guangfeng Yan
Tan Li
Tian-Shing Lan
Kui Wu
Linqi Song
262
12
0
12 Dec 2023
An $LDL^T$ Trust-Region Quasi-Newton Method
John Brust
Philip E. Gill
47
0
0
11 Dec 2023
ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment
Paniz Halvachi
Alexandra Peste
Dan Alistarh
Christoph H. Lampert
182
0
0
11 Dec 2023
Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
Rui Ye
Yaxin Du
Zhenyang Ni
Siheng Chen
Yanfeng Wang
FedML
184
8
0
10 Dec 2023
TaskMet: Task-Driven Metric Learning for Model Learning
Neural Information Processing Systems (NeurIPS), 2023
Dishank Bansal
Ricky T. Q. Chen
Mustafa Mukadam
Brandon Amos
FedML
264
15
0
08 Dec 2023
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Journal of Optimization Theory and Applications (JOTA), 2023
Rajeeva Laxman Karandikar
M. Vidyasagar
353
19
0
05 Dec 2023
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization
Junwen Qiu
Xiao Li
Andre Milzarek
598
3
0
02 Dec 2023
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni
Nicklas Werge
ODL
281
4
0
29 Nov 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent
Frederik Köhne
Leonie Kreis
Anton Schiela
Roland A. Herzog
283
2
0
28 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
208
0
0
27 Nov 2023
Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia
IEEE Access, 2023
Milad Baghalzadeh Shishehgarkhaneh
R. Moehler
Yihai Fang
Amer A. Hijazi
Hamed Aboutorab
267
31
0
23 Nov 2023
Soft Random Sampling: A Theoretical and Empirical Analysis
Xiaodong Cui
Ashish R. Mittal
Songtao Lu
Wei Zhang
G. Saon
Brian Kingsbury
275
2
0
21 Nov 2023
Infinite forecast combinations based on Dirichlet process
Yinuo Ren
Feng Li
Yanfei Kang
Jue Wang
AI4TS
172
0
0
21 Nov 2023
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
295
4
0
20 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
390
4
0
15 Nov 2023
Non-Uniform Smoothness for Gradient Descent
A. Berahas
Lindon Roberts
Fred Roosta
168
5
0
15 Nov 2023
Robust softmax aggregation on blockchain based federated learning with convergence guarantee
Huiyu Wu
Diego Klabjan
FedML
267
3
0
13 Nov 2023
Differentiable Cutting-plane Layers for Mixed-integer Linear Optimization
Gabriele Dragotto
Stefan Clarke
J. F. Fisac
Bartolomeo Stellato
519
7
0
06 Nov 2023
Parameter-Agnostic Optimization under Relaxed Smoothness
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Florian Hübler
Junchi Yang
Xiang Li
Niao He
265
31
0
06 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
669
0
0
06 Nov 2023
High Probability Convergence of Adam Under Unbounded Gradients and Affine Variance Noise
Yusu Hong
Junhong Lin
241
11
0
03 Nov 2023
Learning to optimize by multi-gradient for multi-objective optimization
Linxi Yang
Xinmin Yang
L. Tang
267
1
0
01 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
145
0
0
31 Oct 2023
High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise
Aleksandar Armacki
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
704
10
0
28 Oct 2023
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
IEEE Conference on Decision and Control (CDC), 2023
Sumedh Gupte
Prashanth L. A.
Sanjay P. Bhat
180
2
0
28 Oct 2023
Contextual Stochastic Bilevel Optimization
Neural Information Processing Systems (NeurIPS), 2023
Yifan Hu
Jie Wang
Yao Xie
Andreas Krause
Daniel Kuhn
239
20
0
27 Oct 2023
Performative Prediction: Past and Future
Statistical Science (Statist. Sci.), 2023
Moritz Hardt
Celestine Mendler-Dünner
420
43
0
25 Oct 2023
Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Tao Sun
Congliang Chen
Peng Qiao
Li Shen
Xinwang Liu
Dongsheng Li
192
6
0
23 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
246
6
0
21 Oct 2023
Exponential weight averaging as damped harmonic motion
J. Patsenker
Henry Li
Y. Kluger
186
0
0
20 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
244
22
0
20 Oct 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
270
1
0
19 Oct 2023
LASER: Linear Compression in Wireless Distributed Optimization
Ashok Vardhan Makkuva
Marco Bondaschi
Thijs Vogels
Martin Jaggi
Hyeji Kim
Michael C. Gastpar
394
7
0
19 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
International Conference on Machine Learning (ICML), 2023
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Tian Ding
Zhimin Luo
481
138
0
16 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
458
25
0
16 Oct 2023
Federated Multi-Objective Learning
Haibo Yang
Zhuqing Liu
Jia-Wei Liu
Chaosheng Dong
Michinari Momma
FedML
327
20
0
15 Oct 2023
Fast Sampling and Inference via Preconditioned Langevin Dynamics
Riddhiman Bhattacharya
Tiefeng Jiang
155
3
0
11 Oct 2023
Quantum Shadow Gradient Descent for Quantum Learning
Mohsen Heidari
M. Naved
Wenbo Xie
Arjun Jacob Grama
Wojtek Szpankowski
169
0
0
10 Oct 2023
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Cong Ma
Xingyu Xu
Tian Tong
Yuejie Chi
304
12
0
09 Oct 2023
Learning Layer-wise Equivariances Automatically using Gradients
Neural Information Processing Systems (NeurIPS), 2023
Tycho F. A. van der Ouderaa
Alexander Immer
Mark van der Wilk
MLT
311
21
0
09 Oct 2023
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
Kei Ishikawa
BDL
226
0
0
03 Oct 2023
Epidemic Learning: Boosting Decentralized Learning with Randomized Communication
Neural Information Processing Systems (NeurIPS), 2023
M. Vos
Sadegh Farhadkhani
R. Guerraoui
Anne-Marie Kermarrec
Rafael Pires
Rishi Sharma
310
25
0
03 Oct 2023
Batch-less stochastic gradient descent for compressive learning of deep regularization for image denoising
Journal of Mathematical Imaging and Vision (JMIV), 2023
Hui Shi
Yann Traonmilin
Jean-François Aujol
176
1
0
02 Oct 2023