Optimization Methods for Large-Scale Machine Learning

15 June 2016

Papers citing "Optimization Methods for Large-Scale Machine Learning"

50 / 1,490 papers shown

Cooperative SGD with Dynamic Mixing Matrices

Soumya Sarkar

Shweta Jain

171

20 Aug 2025

Domain-Generalization to Improve Learning in Meta-Learning Algorithms

201

13 Aug 2025

A Spin Glass Characterization of Neural Networks

Jun Li

103

10 Aug 2025

Why Does Stochastic Gradient Descent Slow Down in Low-Precision Training?

Vincent-Daniel Yun

198

10 Aug 2025

Cumulative Learning Rate Adaptation: Revisiting Path-Based Schedules for SGD and Adam

07 Aug 2025

Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization

Wei Liu

Anweshit Panda

Ujwal Pandey

Christopher Brissette

118

07 Aug 2025

Learning Latent Graph Geometry via Fixed-Point Schrödinger-Type Activation: A Theoretical Study

Dmitry Pasechnyuk-Vilensky

Daniil Doroshenko

27 Jul 2025

The Price equation reveals a universal force-metric-bias law of algorithmic learning and natural selection

Steven A. Frank

FedML

380

24 Jul 2025

Physics-aware Truck and Drone Delivery Planning Using Optimization & Machine Learning

Yineng Sun

Armin Fügenschuh

Vikrant Vaze

112

22 Jul 2025

Whom to Trust? Adaptive Collaboration in Personalized Federated Learning

387

30 Jun 2025

Rethinking LLM Training through Information Geometry and Quantum Metrics

Riccardo Di Sipio

172

18 Jun 2025

ImprovDML: Improved Trade-off in Private Byzantine-Resilient Distributed Machine Learning

163

18 Jun 2025

Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting

Duc Toan Nguyen

Trang H. Tran

Lam M. Nguyen

146

14 Jun 2025

Convergence of Momentum-Based Optimization Algorithms with Time-Varying Parameters

Mathukumalli Vidyasagar

303

13 Jun 2025

Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise

175

12 Jun 2025

NDCG-Consistent Softmax Approximation with Accelerated Convergence

181

11 Jun 2025

Neural Tangent Kernel Analysis to Probe Convergence in Physics-informed Neural Solvers: PIKANs vs. PINNs

Salah A Faroughi

Farinaz Mostajeran

129

09 Jun 2025

A Stable Whitening Optimizer for Efficient Neural Network Training

Kevin Frans

Sergey Levine

Pieter Abbeel

383

08 Jun 2025

Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner

297

04 Jun 2025

Deformable registration and generative modelling of aortic anatomies by auto-decoders and neural ODEs

185

01 Jun 2025

Rethinking Regularization Methods for Knowledge Graph Completion

201

29 May 2025

Moment Expansions of the Energy Distance

Ian Langmore

188

27 May 2025

Stationary MMD Points for Cubature

301

27 May 2025

Empirical Investigation of Latent Representational Dynamics in Large Language Models: A Manifold Evolution Perspective

Yukun Zhang

Qi Dong

AI4CE

185

24 May 2025

Implicit Neural Shape Optimization for 3D High-Contrast Electrical Impedance Tomography

Junqing Chen

Haibo Liu

411

22 May 2025

MDVT: Enhancing Multimodal Recommendation with Model-Agnostic Multimodal-Driven Virtual Triplets

189

22 May 2025

TranSUN: A Preemptive Paradigm to Eradicate Retransformation Bias Intrinsically from Regression Models in Recommender Systems

291

20 May 2025

Never Skip a Batch: Continuous Training of Temporal GNNs via Adaptive Pseudo-Supervision

269

18 May 2025

On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm

456

17 May 2025

Dynamic Perturbed Adaptive Method for Infinite Task-Conflicting Time Series

210

17 May 2025

Sharp Gaussian approximations for Decentralized Federated Learning

327

12 May 2025

A stochastic gradient method for trilevel optimization

Tommaso Giovannelli

G. Kent

Luis Nunes Vicente

184

11 May 2025

Entropy-Guided Sampling of Flat Modes in Discrete Spaces

Pinaki Mohanty

Riddhiman Bhattacharya

Ruqi Zhang

994

05 May 2025

Online Functional Principal Component Analysis on a Multidimensional Domain

Muye Nanshan

Nan Zhang

Jiguo Cao

176

04 May 2025

DHO$_2$: Accelerating Distributed Hybrid Order Optimization via Model Parallelism and ADMM

229

02 May 2025

Ultra-fast feature learning for the training of two-layer neural networks in the two-timescale regime

Raphael Barboni

Gabriel Peyré

François-Xavier Vialard

MLT

313

25 Apr 2025

TACO: Tackling Over-correction in Federated Learning with Tailored Adaptive Correction

IEEE International Conference on Distributed Computing Systems (ICDCS), 2025

400

24 Apr 2025

OptimAI: Optimization from Natural Language Using LLM-Powered AI Agents

548

23 Apr 2025

MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design

427

22 Apr 2025

AlphaGrad: Non-Linear Gradient Normalization Optimizer

Soham Sane

ODL

406

22 Apr 2025

Mixed-Precision Conjugate Gradient Solvers with RL-Driven Precision Tuning

Xinye Chen

275

19 Apr 2025

Matrix-free Second-order Optimization of Gaussian Splats with Residual Sampling

Hamza Pehlivan

Andrea Boscolo Camiletto

465

17 Apr 2025

Stochastic Gradient Descent in Non-Convex Problems: Asymptotic Convergence with Relaxed Step-Size via Stopping Time Methods

206

17 Apr 2025

Towards Weaker Variance Assumptions for Stochastic Optimization

Ahmet Alacaoglu

Yura Malitsky

Stephen J. Wright

220

14 Apr 2025

A Tale of Two Learning Algorithms: Multiple Stream Random Walk and Asynchronous Gossip

Peyman Gholami

H. Seferoglu

174

14 Apr 2025

A Piecewise Lyapunov Analysis of Sub-quadratic SGD: Applications to Robust and Quantile Regression

Measurement and Modeling of Computer Systems (SIGMETRICS), 2025

285

11 Apr 2025

Min-Max Optimisation for Nonconvex-Nonconcave Functions Using a Random Zeroth-Order Extragradient Algorithm

Amir Ali Farzin

Yuen-Man Pun

Philipp Braun

Antoine Lesage-Landry

Youssef Diouane

Iman Shames

292

10 Apr 2025

ZIP: An Efficient Zeroth-order Prompt Tuning for Black-box Vision-Language Models

International Conference on Learning Representations (ICLR), 2025

309

09 Apr 2025

Decentralized Domain Generalization with Style Sharing: Formal Model and Convergence Analysis

Shahryar Zehtabi

Dong-Jun Han

Seyyedali Hosseinalipour

Christopher G. Brinton

FedML AI4CE

510

08 Apr 2025

Universal Collection of Euclidean Invariants between Pairs of Position-Orientations

Gijs Bellaard

B. Smets

R. Duits

364

04 Apr 2025