
Scalable Second Order Optimization for Deep Learning
arXiv:2002.09018
20 February 2020
Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Y. Singer

Papers citing "Scalable Second Order Optimization for Deep Learning" (21 papers shown)
Fundamentals of Regression
Miguel A. Mendez
27 Nov 2025

FUSE: First-Order and Second-Order Unified SynthEsis in Stochastic Optimization
Conference on Algebraic Informatics (AI), 2025
Zhanhong Jiang, Md Zahid Hasan, Aditya Balu, Joshua R. Waite, Genyi Huang, Soumik Sarkar
06 Mar 2025

PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
Platform for Advanced Scientific Computing Conference (PASC), 2024
Stefano Zampini, Umberto Zerbinati, George Turkyyiah, David E. Keyes
18 Mar 2024

Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics
Hanlin Yu, M. Hartmann, Bernardo Williams, Arto Klami
09 Mar 2023

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
17 Nov 2022

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Journal of Machine Learning Research (JMLR), 2022
Brian Bartoldson, B. Kailkhura, Davis W. Blalock
13 Oct 2022

On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
Rohan Anil, S. Gadanho, Danya Huang, Nijith Jacob, Zhuoshu Li, ..., Cristina Pop, Kevin Regan, G. Shamir, Rakesh Shivanna, Qiqi Yan
12 Sep 2022

Hessian-Free Second-Order Adversarial Examples for Adversarial Learning
Yaguan Qian, Yu-qun Wang, Bin Wang, Zhaoquan Gu, Yu-Shuang Guo, Wassim Swaileh
04 Jul 2022

Practical tradeoffs between memory, compute, and performance in learned optimizers
Luke Metz, C. Freeman, James Harrison, Niru Maheswaranathan, Jascha Narain Sohl-Dickstein
22 Mar 2022

Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size
Adityanarayanan Radhakrishnan, M. Belkin, Caroline Uhler
30 Dec 2021

Real-time Neural Radiance Caching for Path Tracing
Thomas Müller, Fabrice Rousselle, Jan Novák, A. Keller
23 Jun 2021

A Generalizable Approach to Learning Optimizers
Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba
02 Jun 2021

AsymptoticNG: A regularized natural gradient optimization algorithm with look-ahead strategy
Zedong Tang, Fenlong Jiang, Junke Song, Maoguo Gong, Hao Li, F. Yu, Zidong Wang, Min Wang
24 Dec 2020

Second-order Neural Network Training Using Complex-step Directional Derivative
Siyuan Shen, Tianjia Shao, Kun Zhou, Jian Ren, Feng Luo, Yin Yang
15 Sep 2020

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
Robin M. Schmidt, Frank Schneider, Philipp Hennig
03 Jul 2020

Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein
20 Jun 2020

Daydream: Accurately Estimating the Efficacy of Optimizations for DNN Training
Hongyu Zhu, Amar Phanishayee, Gennady Pekhimenko
05 Jun 2020

PyHessian: Neural Networks Through the Lens of the Hessian
Z. Yao, A. Gholami, Kurt Keutzer, Michael W. Mahoney
16 Dec 2019

Zap Q-Learning With Nonlinear Function Approximation
Neural Information Processing Systems (NeurIPS), 2019
Shuhang Chen, Adithya M. Devraj, Fan Lu, Ana Bušić, Sean P. Meyn
11 Oct 2019

An Adaptive Remote Stochastic Gradient Method for Training Neural Networks
Yushu Chen, Hao Jing, Wenlai Zhao, Zhiqiang Liu, Haohuan Fu, Lián Qiao, Wei Xue, Guangwen Yang
04 May 2019

OverSketched Newton: Fast Convex Optimization for Serverless Systems
Vipul Gupta, S. Kadhe, T. Courtade, Michael W. Mahoney, Kannan Ramchandran
21 Mar 2019
Page 1 of 1