Optimization Methods for Large-Scale Machine Learning

15 June 2016
Léon Bottou
Frank E. Curtis
J. Nocedal

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown
FedNS: A Fast Sketching Newton-Type Algorithm for Federated Learning
Jian Li
Yong Liu
Wei Wang
Haoran Wu
Weiping Wang
FedML
277
6
0
05 Jan 2024
Online Continual Domain Adaptation for Semantic Image Segmentation Using Internal Representations
Serban Stan
Mohammad Rostami
OOD, CLL
243
0
0
02 Jan 2024
SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms
Farshed Abdukhakimov
Chulu Xiang
Dmitry Kamzolov
Robert Mansel Gower
Martin Takáč
297
5
0
28 Dec 2023
Parallel Trust-Region Approaches in Neural Network Training: Beyond Traditional Methods
Ken Trotti
Samuel A. Cruz Alegría
Alena Kopanicáková
Rolf Krause
218
2
0
21 Dec 2023
Continual Learning: Forget-free Winning Subnetworks for Video Representations
Haeyong Kang
Jaehong Yoon
Sung Ju Hwang
Chang D. Yoo
CLL
533
5
0
19 Dec 2023
DePRL: Achieving Linear Convergence Speedup in Personalized Decentralized Learning with Shared Representations
Efstathia Soufleri
Gang Yan
Maroun Touma
Jian Li
302
9
0
17 Dec 2023
Physics-Informed Deep Learning of Rate-and-State Fault Friction
Computer Methods in Applied Mechanics and Engineering (CMAME), 2023
Cody Rucker
Brittany A. Erickson
PINN, AI4CE
268
14
0
14 Dec 2023
Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning
Guangfeng Yan
Tan Li
Tian-Shing Lan
Kui Wu
Linqi Song
262
12
0
12 Dec 2023
An $LDL^T$ Trust-Region Quasi-Newton Method
John Brust
Philip E. Gill
47
0
0
11 Dec 2023
ELSA: Partial Weight Freezing for Overhead-Free Sparse Network Deployment
Paniz Halvachi
Alexandra Peste
Dan Alistarh
Christoph H. Lampert
182
0
0
11 Dec 2023
Fake It Till Make It: Federated Learning with Consensus-Oriented Generation
Rui Ye
Yaxin Du
Zhenyang Ni
Siheng Chen
Yanfeng Wang
FedML
184
8
0
10 Dec 2023
TaskMet: Task-Driven Metric Learning for Model Learning
Neural Information Processing Systems (NeurIPS), 2023
Dishank Bansal
Ricky T. Q. Chen
Mustafa Mukadam
Brandon Amos
FedML
264
15
0
08 Dec 2023
Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications
Journal of Optimization Theory and Applications (JOTA), 2023
Rajeeva Laxman Karandikar
M. Vidyasagar
353
19
0
05 Dec 2023
A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization
Junwen Qiu
Xiao Li
Andre Milzarek
598
3
0
02 Dec 2023
On Adaptive Stochastic Optimization for Streaming Data: A Newton's Method with O(dN) Operations
Antoine Godichon-Baggioni
Nicklas Werge
ODL
281
4
0
29 Nov 2023
Adaptive Step Sizes for Preconditioned Stochastic Gradient Descent
Frederik Köhne
Leonie Kreis
Anton Schiela
Roland A. Herzog
283
2
0
28 Nov 2023
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg
Roland A. Herzog
Frederik Köhne
Leonie Kreis
Anton Schiela
208
0
0
27 Nov 2023
Transformer-based Named Entity Recognition in Construction Supply Chain Risk Management in Australia
IEEE Access, 2023
Milad Baghalzadeh Shishehgarkhaneh
R. Moehler
Yihai Fang
Amer A. Hijazi
Hamed Aboutorab
267
31
0
23 Nov 2023
Soft Random Sampling: A Theoretical and Empirical Analysis
Xiaodong Cui
Ashish R. Mittal
Songtao Lu
Wei Zhang
G. Saon
Brian Kingsbury
275
2
0
21 Nov 2023
Infinite forecast combinations based on Dirichlet process
Yinuo Ren
Feng Li
Yanfei Kang
Jue Wang
AI4TS
172
0
0
21 Nov 2023
High Probability Guarantees for Random Reshuffling
Hengxu Yu
Xiao Li
295
4
0
20 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
390
4
0
15 Nov 2023
Non-Uniform Smoothness for Gradient Descent
A. Berahas
Lindon Roberts
Fred Roosta
168
5
0
15 Nov 2023
Robust softmax aggregation on blockchain based federated learning with convergence guarantee
Huiyu Wu
Diego Klabjan
FedML
267
3
0
13 Nov 2023
Differentiable Cutting-plane Layers for Mixed-integer Linear Optimization
Gabriele Dragotto
Stefan Clarke
J. F. Fisac
Bartolomeo Stellato
519
7
0
06 Nov 2023
Parameter-Agnostic Optimization under Relaxed Smoothness
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Florian Hübler
Junchi Yang
Xiang Li
Niao He
265
31
0
06 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
669
0
0
06 Nov 2023
High Probability Convergence of Adam Under Unbounded Gradients and Affine Variance Noise
Yusu Hong
Junhong Lin
241
11
0
03 Nov 2023
Learning to optimize by multi-gradient for multi-objective optimization
Linxi Yang
Xinmin Yang
L. Tang
267
1
0
01 Nov 2023
Information-Theoretic Trust Regions for Stochastic Gradient-Based Optimization
Philipp Dahlinger
P. Becker
Maximilian Hüttenrauch
Gerhard Neumann
145
0
0
31 Oct 2023
High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise
Aleksandar Armacki
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
704
10
0
28 Oct 2023
Optimization of utility-based shortfall risk: A non-asymptotic viewpoint
IEEE Conference on Decision and Control (CDC), 2023
Sumedh Gupte
A. Prashanth L.
Sanjay P. Bhat
180
2
0
28 Oct 2023
Contextual Stochastic Bilevel Optimization
Neural Information Processing Systems (NeurIPS), 2023
Yifan Hu
Jie Wang
Yao Xie
Andreas Krause
Daniel Kuhn
239
20
0
27 Oct 2023
Performative Prediction: Past and Future
Statistical Science (Statist. Sci.), 2023
Moritz Hardt
Celestine Mendler-Dünner
420
43
0
25 Oct 2023
Rethinking SIGN Training: Provable Nonconvex Acceleration without First- and Second-Order Gradient Lipschitz
Tao Sun
Congliang Chen
Peng Qiao
Li Shen
Xinwang Liu
Dongsheng Li
192
6
0
23 Oct 2023
Graph Neural Networks and Applied Linear Algebra
Nicholas S. Moore
Eric C. Cyr
Peter Ohm
C. Siefert
R. Tuminaro
246
6
0
21 Oct 2023
Exponential weight averaging as damped harmonic motion
J. Patsenker
Henry Li
Y. Kluger
186
0
0
20 Oct 2023
DYNAMITE: Dynamic Interplay of Mini-Batch Size and Aggregation Frequency for Federated Learning with Static and Streaming Dataset
Weijie Liu
Xiaoxi Zhang
Jingpu Duan
Carlee Joe-Wong
Zhi Zhou
Xu Chen
244
22
0
20 Oct 2023
Demystifying the Myths and Legends of Nonconvex Convergence of SGD
Aritra Dutta
El Houcine Bergou
Soumia Boucherouite
Nicklas Werge
M. Kandemir
Xin Li
270
1
0
19 Oct 2023
LASER: Linear Compression in Wireless Distributed Optimization
Ashok Vardhan Makkuva
Marco Bondaschi
Thijs Vogels
Martin Jaggi
Hyeji Kim
Michael C. Gastpar
394
7
0
19 Oct 2023
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
International Conference on Machine Learning (ICML), 2023
Ziniu Li
Tian Xu
Yushun Zhang
Zhihang Lin
Yang Yu
Tian Ding
Zhimin Luo
481
138
0
16 Oct 2023
Over-the-Air Federated Learning and Optimization
Jingyang Zhu
Yuanming Shi
Yong Zhou
Chunxiao Jiang
Wei Chen
Khaled B. Letaief
FedML
458
25
0
16 Oct 2023
Federated Multi-Objective Learning
Haibo Yang
Zhuqing Liu
Jia-Wei Liu
Chaosheng Dong
Michinari Momma
FedML
327
20
0
15 Oct 2023
Fast Sampling and Inference via Preconditioned Langevin Dynamics
Riddhiman Bhattacharya
Tiefeng Jiang
155
3
0
11 Oct 2023
Quantum Shadow Gradient Descent for Quantum Learning
Mohsen Heidari
M. Naved
Wenbo Xie
Arjun Jacob Grama
Wojtek Szpankowski
169
0
0
10 Oct 2023
Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
Cong Ma
Xingyu Xu
Tian Tong
Yuejie Chi
304
12
0
09 Oct 2023
Learning Layer-wise Equivariances Automatically using Gradients
Neural Information Processing Systems (NeurIPS), 2023
Tycho F. A. van der Ouderaa
Alexander Immer
Mark van der Wilk
MLT
311
21
0
09 Oct 2023
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
Kei Ishikawa
BDL
226
0
0
03 Oct 2023
Epidemic Learning: Boosting Decentralized Learning with Randomized Communication
Neural Information Processing Systems (NeurIPS), 2023
M. Vos
Sadegh Farhadkhani
R. Guerraoui
Anne-Marie Kermarrec
Rafael Pires
Rishi Sharma
310
25
0
03 Oct 2023
Batch-less stochastic gradient descent for compressive learning of deep regularization for image denoising
Journal of Mathematical Imaging and Vision (JMIV), 2023
Hui Shi
Yann Traonmilin
Jean-François Aujol
176
1
0
02 Oct 2023