Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Dong Wang
Surya Ganguli
Yoshua Bengio
    ODL

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 631 papers shown
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
326
0
0
08 Oct 2024
Rank Matters: Understanding and Defending Model Inversion Attacks via Low-Rank Feature Filtering
Hongyao Yu
Yixiang Qiu
Hao Fang
Tianqu Zhuang
Sijin Yu
Bin Wang
Shu-Tao Xia
Ke Xu
211
3
0
08 Oct 2024
Does Worst-Performing Agent Lead the Pack? Analyzing Agent Dynamics in Unified Distributed SGD
Neural Information Processing Systems (NeurIPS), 2024
Jie Hu
Yi-Ting Ma
Do Young Eun
FedML
265
2
0
26 Sep 2024
Super Level Sets and Exponential Decay: A Synergistic Approach to Stable Neural Network Training
Journal of Artificial Intelligence Research (JAIR), 2024
J. Chaudhary
Dipak Nidhi
J. Heikkonen
H. Merisaari
R. Kanth
129
0
0
25 Sep 2024
Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models
Yuchen Fang
Sen Na
Michael W. Mahoney
Mladen Kolar
205
3
0
24 Sep 2024
NGD converges to less degenerate solutions than SGD
Moosa Saghir
N. R. Raghavendra
Zihe Liu
Evan Ryan Gunter
164
0
0
07 Sep 2024
Quantum Natural Stochastic Pairwise Coordinate Descent
Mohammad Aamir Sohail
M. H. Khoozani
S. S. Pradhan
179
4
0
18 Jul 2024
Correlations Are Ruining Your Gradient Descent
Nasir Ahmad
354
8
0
15 Jul 2024
Stabler Neo-Hookean Simulation: Absolute Eigenvalue Filtering for Projected Newton
International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), 2024
Honglin Chen
Hsueh-Ti Derek Liu
David I. W. Levin
Changxi Zheng
Alec Jacobson
144
8
0
09 Jun 2024
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
Weijian Deng
Jianfeng Zhang
Bo An
321
3
0
29 May 2024
Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks
Leyang Zhang
Yao Zhang
Yaoyu Zhang
194
0
0
26 May 2024
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Yichu Xu
Xin-Chun Li
Lan Li
De-Chuan Zhan
296
2
0
21 May 2024
Multi-fidelity Hamiltonian Monte Carlo
Dhruv V. Patel
Jonghyun Lee
Matthew W. Farthing
P. Kitanidis
Eric F. Darve
221
0
0
08 May 2024
cuFastTuckerPlus: A Stochastic Parallel Sparse FastTucker Decomposition Using GPU Tensor Cores
Zixuan Li
Mingxing Duan
Huizhang Luo
Wangdong Yang
KenLi Li
Keqin Li
212
0
0
15 Apr 2024
Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod
Jonathan P. Keating
296
10
0
09 Apr 2024
Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications
Lucas Böttcher
Gregory R. Wheeler
251
0
0
05 Apr 2024
Quantization Avoids Saddle Points in Distributed Optimization
Yanan Bo
Yongqiang Wang
MQ
151
6
0
15 Mar 2024
Escaping Local Optima in Global Placement
Ke Xue
Xi Lin
Yunqi Shi
Shixiong Kai
Siyuan Xu
Chao Qian
222
6
0
28 Feb 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
111
0
0
19 Feb 2024
Strong convexity-guided hyper-parameter optimization for flatter losses
Rahul Yedida
Snehanshu Saha
264
0
0
07 Feb 2024
Challenges in Training PINNs: A Loss Landscape Perspective
Pratik Rathore
Weimu Lei
Zachary Frangella
Lu Lu
Madeleine Udell
AI4CE PINN ODL
217
105
0
02 Feb 2024
The Definitive Guide to Policy Gradients in Deep Reinforcement Learning: Theory, Algorithms and Implementations
Matthias Lehmann
227
7
0
24 Jan 2024
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker
Frederick Altrock
Benjamin Risse
429
9
0
22 Jan 2024
GD doesn't make the cut: Three ways that non-differentiability affects neural network training
Siddharth Krishna Kumar
AAML
252
4
0
16 Jan 2024
A topological description of loss surfaces based on Betti Numbers
Neural Networks (NN), 2024
Maria Sofia Bucarelli
Giuseppe Alessio D’Inverno
Monica Bianchini
F. Scarselli
Fabrizio Silvestri
115
4
0
08 Jan 2024
On the Necessity of Metalearning: Learning Suitable Parameterizations for Learning Processes
Massinissa Hamidi
A. Osmani
149
0
0
31 Dec 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
560
0
0
06 Nov 2023
Escaping Saddle Points in Heterogeneous Federated Learning via Distributed SGD with Communication Compression
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Sijin Chen
Zhize Li
Yuejie Chi
FedML
201
5
0
29 Oct 2023
Studying K-FAC Heuristics by Viewing Adam through a Second-Order Lens
International Conference on Machine Learning (ICML), 2023
Ross M. Clarke
José Miguel Hernández-Lobato
292
2
0
23 Oct 2023
Series of Hessian-Vector Products for Tractable Saddle-Free Newton Optimisation of Neural Networks
E. T. Oldewage
Ross M. Clarke
José Miguel Hernández-Lobato
ODL
146
1
0
23 Oct 2023
Subject-specific Deep Neural Networks for Count Data with High-cardinality Categorical Features
Hangbin Lee
I. Ha
Changha Hwang
Youngjo Lee
111
1
0
18 Oct 2023
AdaLomo: Low-memory Optimization with Adaptive Learning Rate
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Kai Lv
Hang Yan
Qipeng Guo
Haijun Lv
Xipeng Qiu
ODL
296
29
0
16 Oct 2023
Going Beyond Neural Network Feature Similarity: The Network Feature Complexity and Its Interpretation Using Category Theory
International Conference on Learning Representations (ICLR), 2023
Yiting Chen
Zhanpeng Zhou
Junchi Yan
248
10
0
10 Oct 2023
Approximating Nash Equilibria in Normal-Form Games via Stochastic Optimization
International Conference on Learning Representations (ICLR), 2023
I. Gemp
Luke Marris
Georgios Piliouras
309
11
0
10 Oct 2023
Enhancing Accuracy in Deep Learning Using Random Matrix Theory
Journal of Machine Learning (JML), 2023
Leonid Berlyand
Etienne Sandier
Yitzchak Shmalo
Lei Zhang
AAML
226
6
0
04 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML MoMe
261
86
0
27 Sep 2023
SAMN: A Sample Attention Memory Network Combining SVM and NN in One Architecture
Qiaoling Yang
Linkai Luo
Haotong Zhang
Hong Peng
Ziyang Chen
60
0
0
25 Sep 2023
Asymmetric Momentum: A Rethinking of Gradient Descent
Gongyue Zhang
Dinghuang Zhang
Shuwen Zhao
Donghan Liu
Carrie M. Toptan
Honghai Liu
ODL
163
1
0
05 Sep 2023
Linear Oscillation: A Novel Activation Function for Vision Transformer
Juyoung Yun
LLMSV
185
0
0
25 Aug 2023
A Critical Review of Physics-Informed Machine Learning Applications in Subsurface Energy Systems
Abdeldjalil Latrach
M. L. Malki
Misael Morales
Mohamed Mehana
M. Rabiei
PINN AI4CE
169
59
0
06 Aug 2023
Fading memory as inductive bias in residual recurrent networks
Neural Networks (Neural Netw.), 2023
I. Dubinin
Felix Effenberger
192
9
0
27 Jul 2023
Convergence of Adam for Non-convex Objectives: Relaxed Hyperparameters and Non-ergodic Case
Machine-mediated learning (ML), 2023
Meixuan He
Yuqing Liang
Jinlan Liu
Dongpo Xu
203
13
0
20 Jul 2023
Accelerating Inexact HyperGradient Descent for Bilevel Optimization
Hai-Long Yang
Luo Luo
C. J. Li
Michael I. Jordan
244
17
0
30 Jun 2023
Black holes and the loss landscape in machine learning
Journal of High Energy Physics (JHEP), 2023
P. Kumar
Taniya Mandal
Swapnamay Mondal
171
2
0
26 Jun 2023
Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles
Le‐Yu Chen
Yaohua Ma
J.N. Zhang
390
9
0
26 Jun 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions
Physical Review X (PRX), 2023
Nishil Patel
Sebastian Lee
Stefano Sarao Mannelli
Sebastian Goldt
Andrew Saxe
OffRL
379
6
0
17 Jun 2023
Towards Better Orthogonality Regularization with Disentangled Norm in Training Deep CNNs
Changhao Wu
Shenan Zhang
Fangsong Long
Ziliang Yin
Tuo Leng
119
2
0
16 Jun 2023
Unveiling the Hessian's Connection to the Decision Boundary
Mahalakshmi Sabanayagam
Freya Behrens
Urte Adomaityte
Anna Dawid
132
7
0
12 Jun 2023
Hidden symmetries of ReLU networks
International Conference on Machine Learning (ICML), 2023
J. E. Grigsby
Kathryn A. Lindsey
David Rolnick
202
26
0
09 Jun 2023
Machine learning with tree tensor networks, CP rank constraints, and tensor dropout
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hao Chen
T. Barthel
236
19
0
30 May 2023