Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
    ODL
ArXiv (abs) · PDF · HTML

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 633 papers shown
Sub-Optimal Local Minima Exist for Neural Networks with Almost All Non-Linear Activations
Tian Ding
Dawei Li
335
14
0
04 Nov 2019
Efficiently avoiding saddle points with zero order methods: No gradients required
Neural Information Processing Systems (NeurIPS), 2019
Lampros Flokas
Emmanouil-Vasileios Vlatakis-Gkaragkounis
Georgios Piliouras
146
35
0
29 Oct 2019
A Simple Dynamic Learning Rate Tuning Algorithm For Automated Training of DNNs
Koyel Mukherjee
Alind Khare
Ashish Verma
144
20
0
25 Oct 2019
Geometry of learning neural quantum states
Chae-Yeun Park
M. Kastoryano
145
71
0
24 Oct 2019
Vanishing Nodes: Another Phenomenon That Makes Training Deep Neural Networks Difficult
Wen-Yu Chang
Tsung-Nan Lin
GNN
137
0
0
22 Oct 2019
On Distributed Stochastic Gradient Algorithms for Global Optimization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Brian Swenson
Anirudh Sridhar
H. Vincent Poor
213
10
0
21 Oct 2019
Hidden Unit Specialization in Layered Neural Networks: ReLU vs. Sigmoidal Activation
Elisa Oostwal
Michiel Straat
Michael Biehl
MLT
252
66
0
16 Oct 2019
Emergent properties of the local geometry of neural loss landscapes
Stanislav Fort
Surya Ganguli
213
54
0
14 Oct 2019
The Expressivity and Training of Deep Neural Networks: toward the Edge of Chaos?
Gege Zhang
Gang-cheng Li
Ningwei Shen
Weidong Zhang
166
7
0
11 Oct 2019
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
International Conference on Machine Learning (ICML), 2019
Tony Duan
Anand Avati
D. Ding
Khanh K. Thai
S. Basu
A. Ng
Alejandro Schuler
BDL
379
388
0
08 Oct 2019
Generalization Bounds for Convolutional Neural Networks
Shan Lin
Jingwei Zhang
MLT
133
36
0
03 Oct 2019
Escaping Saddle Points for Zeroth-order Nonconvex Optimization using Estimated Gradient Descent
Annual Conference on Information Sciences and Systems (CISS), 2019
Qinbo Bai
Mridul Agarwal
Vaneet Aggarwal
142
7
0
03 Oct 2019
The asymptotic spectrum of the Hessian of DNN throughout training
International Conference on Learning Representations (ICLR), 2019
Arthur Jacot
Franck Gabriel
Clément Hongler
264
38
0
01 Oct 2019
Distance Geometry and Data Science
TOP - An Official Journal of the Spanish Society of Statistics and Operations Research (TOP), 2019
Leo Liberti
107
33
0
18 Sep 2019
Dynamics of Deep Neural Networks and Neural Tangent Hierarchy
International Conference on Machine Learning (ICML), 2019
Jiaoyang Huang
H. Yau
169
161
0
18 Sep 2019
Quantum algorithm for finding the negative curvature direction in non-convex optimization
Kaining Zhang
Min-hsiu Hsieh
Liu Liu
Dacheng Tao
129
3
0
17 Sep 2019
diffGrad: An Optimization Method for Convolutional Neural Networks
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2019
S. Dubey
Soumendu Chakraborty
Swalpa Kumar Roy
Snehasis Mukherjee
S. Singh
B. B. Chaudhuri
ODL
359
214
0
12 Sep 2019
Solving Continual Combinatorial Selection via Deep Reinforcement Learning
International Joint Conference on Artificial Intelligence (IJCAI), 2019
Hyungseok Song
Hyeryung Jang
H. Tran
Se-eun Yoon
Kyunghwan Son
Donggyu Yun
Hyoju Chung
Yung Yi
74
11
0
09 Sep 2019
Towards Understanding the Importance of Noise in Training Neural Networks
International Conference on Machine Learning (ICML), 2019
Mo Zhou
Tianyi Liu
Yan Li
Dachao Lin
Enlu Zhou
T. Zhao
MLT
128
29
0
07 Sep 2019
Automated Polysomnography Analysis for Detection of Non-Apneic and Non-Hypopneic Arousals using Feature Engineering and a Bidirectional LSTM Network
Ali Bahrami Rad
M. Zabihi
Zheng Zhao
Moncef Gabbouj
Aggelos K. Katsaggelos
Simo Särkkä
166
6
0
06 Sep 2019
LCA: Loss Change Allocation for Neural Network Training
Neural Information Processing Systems (NeurIPS), 2019
Janice Lan
Rosanne Liu
Hattie Zhou
J. Yosinski
198
27
0
03 Sep 2019
Partitioned integrators for thermodynamic parameterization of neural networks
Foundations of Data Science (FODS), 2019
Benedict Leimkuhler
Charles Matthews
Tiffany J. Vlaar
ODL
231
21
0
30 Aug 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
Guan-Horng Liu
Evangelos A. Theodorou
AI4CE
293
74
0
28 Aug 2019
The many faces of deep learning
Raul Vicente
FedML AI4CE
96
0
0
25 Aug 2019
Tackling Algorithmic Bias in Neural-Network Classifiers using Wasserstein-2 Regularization
Journal of Mathematical Imaging and Vision (JMIV), 2019
Laurent Risser
Alberto González Sanz
Quentin Vincenot
Jean-Michel Loubes
313
24
0
15 Aug 2019
Distributed Gradient Descent: Nonconvergence to Saddle Points and the Stable-Manifold Theorem
Allerton Conference on Communication, Control, and Computing (Allerton), 2019
Brian Swenson
Ryan W. Murray
H. Vincent Poor
S. Kar
358
16
0
07 Aug 2019
Extending the step-size restriction for gradient descent to avoid strict saddle points
SIAM Journal on Mathematics of Data Science (SIMODS), 2019
Hayden Schaeffer
S. McCalla
181
8
0
05 Aug 2019
Multi-Point Bandit Algorithms for Nonstationary Online Nonconvex Optimization
Abhishek Roy
Krishnakumar Balasubramanian
Saeed Ghadimi
P. Mohapatra
OffRL
196
16
0
31 Jul 2019
Towards Understanding Generalization in Gradient-Based Meta-Learning
Simon Guiroy
Vikas Verma
C. Pal
153
22
0
16 Jul 2019
SGD momentum optimizer with step estimation by online parabola model
J. Duda
ODL
114
30
0
16 Jul 2019
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea
Berfin Simsek
Bernd Illing
W. Gerstner
284
65
0
05 Jul 2019
Learning One-hidden-layer neural networks via Provable Gradient Descent with Random Initialization
Shuhao Xia
Yuanming Shi
ODL MLT
110
0
0
04 Jul 2019
The Difficulty of Training Sparse Neural Networks
Utku Evci
Fabian Pedregosa
Aidan Gomez
Erich Elsen
332
106
0
25 Jun 2019
Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
SIAM Journal on Control and Optimization (SICON), 2019
Jianchao Tan
Alec Koppel
Haoqi Zhu
Tamer Basar
397
207
0
19 Jun 2019
Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets
Neural Information Processing Systems (NeurIPS), 2019
Rohith Kuditipudi
Xiang Wang
Holden Lee
Yi Zhang
Zhiyuan Li
Wei Hu
Sanjeev Arora
Rong Ge
FAtt
392
101
0
14 Jun 2019
Critical Point Finding with Newton-MR by Analogy to Computing Square Roots
Charles G. Frye
103
1
0
12 Jun 2019
A Closer Look at the Optimization Landscapes of Generative Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
Hugo Berard
Gauthier Gidel
Amjad Almahairi
Pascal Vincent
Damien Scieur
GAN
171
66
0
11 Jun 2019
Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence
Neural Information Processing Systems (NeurIPS), 2019
Aditya Golatkar
Alessandro Achille
Stefano Soatto
147
103
0
30 May 2019
Single neuron-based neural networks are as efficient as dense deep neural networks in binary and multi-class recognition problems
Yassin Khalifa
Justin Hawks
E. Sejdić
53
0
0
28 May 2019
Abstraction Mechanisms Predict Generalization in Deep Neural Networks
International Conference on Machine Learning (ICML), 2019
Alex Gain
H. Siegelmann
AI4CE
223
6
0
27 May 2019
Deep Online Learning with Stochastic Constraints
Guy Uziel
OffRL BDL
82
2
0
26 May 2019
Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension
Neural Information Processing Systems (NeurIPS), 2019
Yunfei Teng
Wenbo Gao
F. Chalus
A. Choromańska
Shiqian Ma
Adrian Weller
441
14
0
24 May 2019
Convergence Analyses of Online ADAM Algorithm in Convex Setting and Two-Layer ReLU Neural Network
Biyi Fang
Diego Klabjan
216
8
0
22 May 2019
Adaptive norms for deep learning with regularized Newton methods
Jonas Köhler
Leonard Adolphs
Aurelien Lucchi
ODL
163
12
0
22 May 2019
Orthogonal Deep Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Kui Jia
Shuai Li
Yuxin Wen
Tongliang Liu
Dacheng Tao
182
149
0
15 May 2019
A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks
Lior Deutsch
Erik Nijkamp
Yu Yang
107
16
0
07 May 2019
Deep learning as optimal control problems: models and numerical methods
Martin Benning
E. Celledoni
Matthias Joachim Ehrhardt
B. Owren
Carola-Bibiane Schönlieb
258
88
0
11 Apr 2019
Traversing the noise of dynamic mini-batch sub-sampled loss functions: A visual guide
D. Kafka
D. Wilke
142
0
0
20 Mar 2019
Annealing for Distributed Global Optimization
IEEE Conference on Decision and Control (CDC), 2019
Brian Swenson
S. Kar
H. Vincent Poor
J. M. F. Moura
178
31
0
18 Mar 2019
Machine Learning Solutions for High Energy Physics: Applications to Electromagnetic Shower Generation, Flavor Tagging, and the Search for di-Higgs Production
Michela Paganini
119
1
0
12 Mar 2019