1606.04838

Optimization Methods for Large-Scale Machine Learning

15 June 2016

Frank E. Curtis

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown

Towards Exact Gradient-based Training on Analog In-memory Computing

Neural Information Processing Systems (NeurIPS), 2024

294

6

0

18 Jun 2024

Generative vs. Discriminative modeling under the lens of uncertainty quantification

Elouan Argouarc'h

François Desbouvries

216

1

0

13 Jun 2024

Loss Gradient Gaussian Width based Generalization and Optimization Guarantees

442

0

0

11 Jun 2024

A Generalized Version of Chung's Lemma and its Applications

224

1

0

09 Jun 2024

Provable Complexity Improvement of AdaGrad over SGD: Upper and Lower Bounds in Stochastic Non-Convex Optimization

Annual Conference Computational Learning Theory (COLT), 2024

Devyani Maladkar

421

6

0

07 Jun 2024

Efficient Data-Parallel Continual Learning with Asynchronous Distributed Rehearsal Buffers

Bogdan Nicolae

Alexandru Costan

Gabriel Antoniu

198

2

0

05 Jun 2024

Demystifying SGD with Doubly Stochastic Gradients

Jacob R. Gardner

401

2

0

03 Jun 2024

Privacy-Aware Randomized Quantization via Linear Programming

Mohammad Mahdi Khalili

378

2

0

01 Jun 2024

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation

Adam Wierman

Ming Jin

288

7

0

31 May 2024

Symmetries in Overparametrized Neural Networks: A Mean-Field View

Joaquin Fontbona

528

4

0

30 May 2024

A Pontryagin Perspective on Reinforcement Learning

Michael Muehlebach

383

4

0

28 May 2024

WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average

Masih Aminbeidokhti

Eugene Belilovsky

Edouard Oyallon

340

7

0

27 May 2024

Derivatives of Stochastic Gradient Descent

Edouard Pauwels

195

1

0

24 May 2024

Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks

Johannes Müller

Marius Zeinhofer

357

18

0

24 May 2024

Exact Gauss-Newton Optimization for Training Deep Neural Networks

Adeyemi Damilare Adeoye

Alberto Bemporad

301

6

0

23 May 2024

Thermodynamic Natural Gradient Descent

Kaelan Donatella

Samuel Duffield

Patrick J. Coles

139

4

0

22 May 2024

Almost sure convergence rates of stochastic gradient methods under gradient domination

Simon Weissmann

350

6

0

22 May 2024

Energy-Efficient Federated Edge Learning with Streaming Data: A Lyapunov Optimization Approach

Erik G. Larsson

207

8

0

20 May 2024

Reinforcement learning

Florentin Wörgötter

628

2,932

0

16 May 2024

Minimisation of Polyak-Łojasewicz Functions Using Random Zeroth-Order Oracles

European Control Conference (ECC), 2024

Amir Ali Farzin

150

3

0

15 May 2024

Robust Semi-supervised Learning by Wisely Leveraging Open-set Data

300

27

0

11 May 2024

Optimal Baseline Corrections for Off-Policy Contextual Bandits

ACM Conference on Recommender Systems (RecSys), 2024

Harrie Oosterhuis

Maarten de Rijke

290

11

0

09 May 2024

Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving

International Conference on Machine Learning (ICML), 2024

210

0

0

05 May 2024

A Full Adagrad algorithm with O(Nd) operations

Antoine Godichon-Baggioni

291

0

0

03 May 2024

The Privacy Power of Correlated Noise in Decentralized Learning

Youssef Allouah

Anastasia Koloskova

Aymane El Firdoussi

274

18

0

02 May 2024

On the Relevance of Byzantine Robust Optimization Against Data Poisoning

Sadegh Farhadkhani

256

2

0

01 May 2024

IID Relaxation by Logical Expressivity: A Research Agenda for Fitting Logics to Neurosymbolic Requirements

Alessandra Mileo

207

1

0

30 Apr 2024

Advancing Supervised Learning with the Wave Loss Function: A Robust and Smooth Approach

Muhammad Tanveer

238

28

0

28 Apr 2024

Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients

Michal Dereziński

232

2

0

23 Apr 2024

Rate Analysis of Coupled Distributed Stochastic Approximation for Misspecified Optimization

190

0

0

21 Apr 2024

FedMeS: Personalized Federated Continual Learning Leveraging Local Memory

262

1

0

19 Apr 2024

DACAD: Domain Adaptation Contrastive Learning for Anomaly Detection in Multivariate Time Series

Zahra Zamanzadeh Darban

Geoffrey I. Webb

Charu C. Aggarwal

495

12

0

17 Apr 2024

I/O in Machine Learning Applications on HPC Systems: A 360-degree Survey

538

4

0

16 Apr 2024

Minimizing Chebyshev Prototype Risk Magically Mitigates the Perils of Overfitting

Nathaniel R. Dean

207

0

0

10 Apr 2024

Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model

Jonathan P. Keating

356

10

0

09 Apr 2024

Stochastic Online Optimization for Cyber-Physical and Robotic Systems

Hao Ma

Melanie Zeilinger

Michael Muehlebach

233

3

0

08 Apr 2024

A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network

869

2

0

07 Apr 2024

Optimal Batch Allocation for Wireless Federated Learning

IEEE Internet of Things Journal (IEEE IoT J.), 2024

211

1

0

03 Apr 2024

Satellite Federated Edge Learning: Architecture Design and Convergence Analysis

IEEE Transactions on Wireless Communications (IEEE TWC), 2024

Khaled B. Letaief

246

25

0

02 Apr 2024

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

International Conference on Machine Learning (ICML), 2024

271

20

0

02 Apr 2024

DRIVE: Dual Gradient-Based Rapid Iterative Pruning

Dhananjay Saikumar

Blesson Varghese

215

3

0

01 Apr 2024

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance

436

13

0

01 Apr 2024

Communication Efficient Distributed Training with Distributed Lion

Raghuraman Krishnamoorthi

316

11

0

30 Mar 2024

HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks

Michal Dereziński

183

0

0

26 Mar 2024

Computational Approaches for Exponential-Family Factor Analysis

Liang Wang

246

2

0

22 Mar 2024

AI and Memory Wall

Sehoon Kim

Coleman Hooper

Michael W. Mahoney

255

268

0

21 Mar 2024

PETScML: Second-order solvers for training regression problems in Scientific Machine Learning

Platform for Advanced Scientific Computing Conference (PASC), 2024

Stefano Zampini

Umberto Zerbinati

George Turkiyyah

225

6

0

18 Mar 2024

Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates

International Conference on Machine Learning (ICML), 2024

Riccardo Grazzi

Massimiliano Pontil

344

3

0

18 Mar 2024

A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

...

204

17

0

17 Mar 2024

Streamlining in the Riemannian Realm: Efficient Riemannian Optimization with Loopless Variance Reduction

Yury Demidovich

Grigory Malinovsky

Peter Richtárik

243

3

0

11 Mar 2024

Page 5 of 30
