Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014

10 June 2014

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 633 papers shown

A Saddle Point Remedy: Power of Variable Elimination in Non-convex Optimization

03 Nov 2025

Non-Singularity of the Gradient Descent map for Neural Networks with Piecewise Analytic Activations

Alexandru Crăciun

Debarghya Ghoshdastidar

MLT

113

28 Oct 2025

Nonlinear discretizations and Newton's method: characterizing stationary points of regression objectives

Conor Rowan

ODL

231

13 Oct 2025

Long-tailed Recognition with Model Rebalancing

150

09 Oct 2025

AutoBalance: An Automatic Balancing Framework for Training Physics-Informed Neural Networks

118

08 Oct 2025

How Does Preconditioning Guide Feature Learning in Deep Neural Networks?

Kotaro Yoshida

Atsushi Nitanda

243

30 Sep 2025

Visualization and Analysis of the Loss Landscape in Graph Neural Networks

International Conference on Artificial Neural Networks (ICANN), 2025

129

15 Sep 2025

An Analysis of Layer-Freezing Strategies for Enhanced Transfer Learning in YOLO Architectures

Andrzej D. Dobrzycki

Ana M. Bernardos

José Ramón Casar

109

05 Sep 2025

Globally aware optimization with resurgence

Wei Bu

01 Sep 2025

Adaptive Heavy-Tailed Stochastic Gradient Descent

Bodu Gong

Gustavo Enrique Batista

Pierre Lafaye de Micheaux

157

29 Aug 2025

Saddle Hierarchy in Dense Associative Memory

Robin Thériault

Daniele Tantari

26 Aug 2025

Algebraic Approach to Ridge-Regularized Mean Squared Error Minimization in Minimal ReLU Neural Network

Ryoya Fukasaku

Y. Kabata

Akifumi Okuno

139

25 Aug 2025

Understanding Data Influence with Differential Approximation

274

20 Aug 2025

A Spin Glass Characterization of Neural Networks

Jun Li

100

10 Aug 2025

Communication-Efficient Distributed Training for Collaborative Flat Optima Recovery in Deep Learning

Tolga Dimlioglu

A. Choromańska

FedML

296

27 Jul 2025

Dimer-Enhanced Optimization: A First-Order Approach to Escaping Saddle Points in Neural Network Training

306

26 Jul 2025

Fishers for Free? Approximating the Fisher Information Matrix by Recycling the Squared Gradient Accumulator

217

24 Jul 2025

Towards Robust Surrogate Models: Benchmarking Machine Learning Approaches to Expediting Phase Field Simulations of Brittle Fracture

Erfan Hamdi

Emma Lejeune

OOD AI4CE

256

09 Jul 2025

HiPreNets: High-Precision Neural Networks through Progressive Training

Ethan Mulle

W. Kang

Q. Gong

266

18 Jun 2025

A Study of Hybrid and Evolutionary Metaheuristics for Single Hidden Layer Feedforward Neural Network Architecture

Gautam Siddharth Kashyap

Md. Tabrez Nafis

S. Wazir

276

17 Jun 2025

Flat Channels to Infinity in Neural Loss Landscapes

314

17 Jun 2025

Can Hessian-Based Insights Support Fault Diagnosis in Attention-based Models?

Sigma Jahan

Mohammad Masudur Rahman

191

09 Jun 2025

Come Together, But Not Right Now: A Progressive Strategy to Boost Low-Rank Adaptation

...

285

06 Jun 2025

A projection-based framework for gradient-free and parallel learning

240

06 Jun 2025

Look Within or Look Beyond? A Theoretical Comparison Between Parameter-Efficient and Full Fine-Tuning

192

28 May 2025

Understanding Differential Transformer Unchains Pretrained Self-Attentions

Chaerin Kong

Jiho Jang

Nojun Kwak

462

22 May 2025

Block-Biased Mamba for Long-Range Sequence Processing

Annan Yu

N. Benjamin Erichson

Mamba

354

13 May 2025

Phase Transitions between Accuracy Regimes in L2 regularized Deep Neural Networks

Ibrahim Talha Ersoy

Karoline Wiesner

275

10 May 2025

Towards Quantifying the Hessian Structure of Neural Networks

Zhaorui Dong

Yushun Zhang

Jianfeng Yao

303

05 May 2025

Connecting Parameter Magnitudes and Hessian Eigenspaces at Scale using Sketched Methods

353

20 Apr 2025

SDEIT: Semantic-Driven Electrical Impedance Tomography

271

05 Apr 2025

Identifying Sparsely Active Circuits Through Local Loss Landscape Decomposition

Brianna Chrisman

Lucius Bushnaq

Lee D. Sharkey

351

31 Mar 2025

Almost Bayesian: The Fractal Dynamics of Stochastic Gradient Descent

Max Hennick

Stijn De Baerdemacker

229

28 Mar 2025

High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise

Yuchen Fang

Javad Lavaei

Katya Scheinberg

289

24 Mar 2025

A Framework for Finding Local Saddle Points in Two-Player Zero-Sum Black-Box Games

305

23 Mar 2025

From Equations to Insights: Unraveling Symbolic Structures in PDEs with LLMs

297

13 Mar 2025

Hamiltonian Neural Networks for Robust Out-of-Time Credit Scoring

Javier Marín

431

13 Mar 2025

SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation

1.3K

25 Feb 2025

Verification and Validation for Trustworthy Scientific Machine Learning

John D. Jakeman

Lorena A. Barba

J. Martins

Thomas O'Leary-Roseberry

AI4CE

466

21 Feb 2025

Optimal Subspace Inference for the Laplace Approximation of Bayesian Neural Networks

Josua Faller

Jörg Martin

BDL

361

04 Feb 2025

Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning

International Conference on Learning Representations (ICLR), 2025

353

29 Jan 2025

SSE-SAM: Balancing Head and Tail Classes Gradually through Stage-Wise SAM

AAAI Conference on Artificial Intelligence (AAAI), 2024

520

18 Dec 2024

Causal Invariance Learning via Efficient Nonconvex Optimization

450

16 Dec 2024

Curvature in the Looking-Glass: Optimal Methods to Exploit Curvature of Expectation in the Loss Landscape

348

25 Nov 2024

Neural Network-based High-index Saddle Dynamics Method for Searching Saddle Points and Solution Landscape

Yuankai Liu

Lei Zhang

Jin Zhao

161

25 Nov 2024

Don't Be So Positive: Negative Step Sizes in Second-Order Methods

Betty Shea

Mark Schmidt

ODL

273

18 Nov 2024

Data movement limits to frontier model training

Ege Erdil

David Schneider-Joseph

378

02 Nov 2024

CopRA: A Progressive LoRA Training Strategy

Xiequn Wang

Yu Zhang

250

30 Oct 2024

A Mathematical Analysis of Neural Operator Behaviors

Vu-Anh Le

Mehmet Dik

AI4CE

197

28 Oct 2024

Trust-Region Eigenvalue Filtering for Projected Newton

ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2024

David I. W. Levin

185

14 Oct 2024