Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
    ODL

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 632 papers shown
Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling
Tsendsuren Munkhdalai
CLL OffRL
195
5
0
03 Sep 2020
Deep Reinforcement Learning for Contact-Rich Skills Using Compliant Movement Primitives
Oren Spector
M. Zacksenhouse
85
17
0
30 Aug 2020
Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra
Vardan Papyan
217
84
0
27 Aug 2020
Unconstrained optimisation on Riemannian manifolds
T. Truong
94
4
0
25 Aug 2020
A community-powered search of machine learning strategy space to find NMR property prediction models
Lars A. Bratholm
W. Gerrard
Brandon M. Anderson
Shaojie Bai
Sunghwan Choi
...
A. Torrubia
Devin Willmott
C. Butts
David R. Glowacki
Kaggle participants
175
19
0
13 Aug 2020
Meta Continual Learning via Dynamic Programming
R. Krishnan
Dali Wang
CLL
168
10
0
05 Aug 2020
Binary Search and First Order Gradient Based Method for Stochastic Optimization
V. Pandey
ODL
105
0
0
27 Jul 2020
Quantum algorithms for escaping from saddle points
Chenyi Zhang
Jiaqi Leng
Tongyang Li
318
23
0
20 Jul 2020
Understanding Implicit Regularization in Over-Parameterized Single Index Model
Journal of the American Statistical Association (JASA), 2020
Jianqing Fan
Zhuoran Yang
Mengxin Yu
313
22
0
16 Jul 2020
Biological credit assignment through dynamic inversion of feedforward networks
Neural Information Processing Systems (NeurIPS), 2020
William F. Podlaski
C. Machens
193
21
0
10 Jul 2020
Reformulation of the No-Free-Lunch Theorem for Entangled Data Sets
Physical Review Letters (PRL), 2020
Kunal Sharma
M. Cerezo
Zoë Holmes
L. Cincio
A. Sornborger
Patrick J. Coles
251
53
0
09 Jul 2020
Weak error analysis for stochastic gradient descent optimization algorithms
A. Bercher
Lukas Gonon
Arnulf Jentzen
Diyora Salimova
266
4
0
03 Jul 2020
The Global Landscape of Neural Networks: An Overview
Tian Ding
Dawei Li
Shiyu Liang
Tian Ding
R. Srikant
216
93
0
02 Jul 2020
Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications
Yating Wang
Wei Deng
Guang Lin
163
13
0
29 Jun 2020
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
Zeke Xie
Xinrui Wang
Huishuai Zhang
Issei Sato
Masashi Sugiyama
ODL
620
57
0
29 Jun 2020
An analytic theory of shallow networks dynamics for hinge loss classification
Franco Pellegrini
Giulio Biroli
192
19
0
19 Jun 2020
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol
S. Zohren
Stephen J. Roberts
ODL
494
63
0
16 Jun 2020
SPLASH: Learnable Activation Functions for Improving Accuracy and Adversarial Robustness
Mohammadamin Tavakoli
Forest Agostinelli
Pierre Baldi
AAML FAtt
290
43
0
16 Jun 2020
The Limit of the Batch Size
Yang You
Yuhui Wang
Huan Zhang
Zhao-jie Zhang
J. Demmel
Cho-Jui Hsieh
234
23
0
15 Jun 2020
Passive Batch Injection Training Technique: Boosting Network Performance by Injecting Mini-Batches from a different Data Distribution
Pravendra Singh
Pratik Mazumder
Vinay P. Namboodiri
130
0
0
08 Jun 2020
Halting Time is Predictable for Large Models: A Universality Property and Average-case Analysis
Courtney Paquette
B. V. Merrienboer
Elliot Paquette
Fabian Pedregosa
375
29
0
08 Jun 2020
On the Promise of the Stochastic Generalized Gauss-Newton Method for Training DNNs
Matilde Gargiani
Andrea Zanelli
Moritz Diehl
Katharina Eggensperger
ODL
202
19
0
03 Jun 2020
A fast and simple modification of Newton's method helping to avoid saddle points
T. Truong
T. Tô
T. H. Nguyen
Thu Hang Nguyen
H. Nguyen
M. Helmy
154
4
0
02 Jun 2020
Review of Mathematical frameworks for Fairness in Machine Learning
E. del Barrio
Paula Gordaliza
Jean-Michel Loubes
FaML FedML
124
43
0
26 May 2020
qDKT: Question-centric Deep Knowledge Tracing
Educational Data Mining (EDM), 2020
Shashank Sonkar
Andrew E. Waters
Andrew Lan
Phillip J. Grimaldi
Richard G. Baraniuk
AI4Ed
148
48
0
25 May 2020
DENS-ECG: A Deep Learning Approach for ECG Signal Delineation
A. Peimankar
S. Puthusserypady
183
140
0
18 May 2020
Escaping Saddle Points Efficiently with Occupation-Time-Adapted Perturbations
Xin Guo
Jiequn Han
Mahan Tajrobehkar
Wenpin Tang
208
4
0
09 May 2020
Optimization in Machine Learning: A Distribution Space Approach
Communication on Applied Mathematics and Computation (CAMC), 2020
Yongqiang Cai
Qianxiao Li
Zuowei Shen
115
1
0
18 Apr 2020
Symmetry & critical points for a model shallow neural network
Yossi Arjevani
M. Field
464
15
0
23 Mar 2020
Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses
Neural Computation (Neural Comput.), 2020
Charles G. Frye
James B. Simon
Neha S. Wadia
A. Ligeralde
M. DeWeese
K. Bouchard
ODL
163
3
0
23 Mar 2020
Block Layer Decomposition schemes for training Deep Neural Networks
Journal of Global Optimization (J. Glob. Optim.), 2019
L. Palagi
R. Seccia
118
6
0
18 Mar 2020
On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks
Computer Vision and Pattern Recognition (CVPR), 2020
Yue Zhao
Yuwei Wu
Caihua Chen
A. Lim
3DPC
320
87
0
27 Feb 2020
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
Sanghyun Hong
Varun Chandrasekaran
Yigitcan Kaya
Tudor Dumitras
Nicolas Papernot
AAML
200
149
0
26 Feb 2020
Convergence to Second-Order Stationarity for Non-negative Matrix Factorization: Provably and Concurrently
Ioannis Panageas
Stratis Skoulakis
Antonios Varvitsiotis
Tianlin Li
236
2
0
26 Feb 2020
Investigating the interaction between gradient-only line searches and different activation functions
D. Kafka
D. Wilke
77
0
0
23 Feb 2020
The Break-Even Point on Optimization Trajectories of Deep Neural Networks
International Conference on Learning Representations (ICLR), 2020
Stanislaw Jastrzebski
Maciej Szymczak
Stanislav Fort
Devansh Arpit
Jacek Tabor
Dong Wang
Krzysztof J. Geras
243
183
0
21 Feb 2020
Depth Descent Synchronization in $\mathrm{SO}(D)$
International Journal of Computer Vision (IJCV), 2020
Tyler Maunu
Gilad Lerman
MDE
278
2
0
13 Feb 2020
Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 1: Theory
Bo Liu
FAtt MLT
242
1
0
12 Feb 2020
Ill-Posedness and Optimization Geometry for Nonlinear Neural Network Training
Thomas O'Leary-Roseberry
Omar Ghattas
137
5
0
07 Feb 2020
Low Rank Saddle Free Newton: A Scalable Method for Stochastic Nonconvex Optimization
Thomas O'Leary-Roseberry
Nick Alger
Omar Ghattas
ODL
170
9
0
07 Feb 2020
SEERL: Sample Efficient Ensemble Reinforcement Learning
Adaptive Agents and Multi-Agent Systems (AAMAS), 2019
Rohan Saphal
Balaraman Ravindran
Dheevatsa Mudigere
Sasikanth Avancha
Bharat Kaul
161
20
0
15 Jan 2020
Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches
Journal of Global Optimization (JGO), 2020
D. Kafka
D. Wilke
123
8
0
15 Jan 2020
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
International Conference on Machine Learning (ICML), 2019
Aleksandr Shevchenko
Marco Mondelli
414
41
0
20 Dec 2019
Deep Curvature Suite
Diego Granziol
Xingchen Wan
T. Garipov
3DV
190
12
0
20 Dec 2019
Optimization for deep learning: theory and algorithms
Tian Ding
ODL
340
178
0
19 Dec 2019
Orthogonal Convolutional Neural Networks
Computer Vision and Pattern Recognition (CVPR), 2019
Jiayun Wang
Yubei Chen
Rudrasis Chakraborty
Stella X. Yu
380
207
0
27 Nov 2019
A Sub-sampled Tensor Method for Non-convex Optimization
Aurelien Lucchi
Jonas Köhler
168
0
0
23 Nov 2019
Neural Network Memorization Dissection
Jindong Gu
Volker Tresp
FedML
118
12
0
21 Nov 2019
Information-Theoretic Local Minima Characterization and Regularization
International Conference on Machine Learning (ICML), 2019
Zhiwei Jia
Hao Su
232
21
0
19 Nov 2019
Convergence to minima for the continuous version of Backtracking Gradient Descent
T. Truong
134
18
0
11 Nov 2019