1406.2572
Cited By
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Dong Wang
Surya Ganguli
Yoshua Bengio
ODL
Papers citing
"Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"
50 / 631 papers shown
Title
Escaping Saddle Points with Compressed SGD
Neural Information Processing Systems (NeurIPS), 2021
Dmitrii Avdiukhin
G. Yaroslavtsev
162
4
0
21 May 2021
Apply Artificial Neural Network to Solving Manpower Scheduling Problem
Tianyu Liu
Lingyu Zhang
64
2
0
07 May 2021
A Bi-Encoder LSTM Model For Learning Unstructured Dialogs
Diwanshu Shekhar
P. Negi
Mohammad H. Mahoor
96
2
0
25 Apr 2021
Exact Stochastic Second Order Deep Learning
F. Mehouachi
C. Kasmi
ODL
61
1
0
08 Apr 2021
Training Deep Neural Networks via Branch-and-Bound
Yuanwei Wu
Ziming Zhang
Guanghui Wang
ODL
222
0
0
05 Apr 2021
Generative Minimization Networks: Training GANs Without Competition
Paulina Grnarova
Yannic Kilcher
Kfir Y. Levy
Aurelien Lucchi
Thomas Hofmann
GAN
88
8
0
23 Mar 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Journal of nonlinear science (J. Nonlinear Sci.), 2021
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
190
12
0
19 Mar 2021
Escaping Saddle Points in Distributed Newton's Method with Communication Efficiency and Byzantine Resilience
Avishek Ghosh
R. Maity
A. Mazumdar
Kannan Ramchandran
FedML
262
4
0
17 Mar 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Neural Information Processing Systems (NeurIPS), 2021
Zhenyu Liao
Michael W. Mahoney
229
37
0
02 Mar 2021
Panel semiparametric quantile regression neural network for electricity consumption forecasting
Ecological Informatics (Ecol. Inform.), 2021
Xingcai Zhou
Jiangyan Wang
AI4TS
104
19
0
01 Mar 2021
Spurious Local Minima Are Common for Deep Neural Networks with Piecewise Linear Activations
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Bo Liu
123
8
0
25 Feb 2021
Learning Neural Network Subspaces
International Conference on Machine Learning (ICML), 2021
Mitchell Wortsman
Maxwell Horton
Carlos Guestrin
Ali Farhadi
Mohammad Rastegari
UQCV
265
96
0
20 Feb 2021
Training Aware Sigmoidal Optimizer
David Macêdo
Pedro Dreyer
Teresa B Ludermir
Cleber Zanchettin
ODL
52
2
0
17 Feb 2021
Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville
Diego Granziol
J. Keating
271
12
0
12 Feb 2021
Exploiting Spline Models for the Training of Fully Connected Layers in Neural Network
Kanya Mo
Shen Zheng
Xiwei Wang
Jinghua Wang
Klaus-Dieter Schewe Zhejiang University
59
0
0
12 Feb 2021
SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality
Annual Conference Computational Learning Theory (COLT), 2021
Courtney Paquette
Kiwon Lee
Fabian Pedregosa
Elliot Paquette
117
38
0
08 Feb 2021
Escaping Saddle Points for Nonsmooth Weakly Convex Functions via Perturbed Proximal Algorithms
Minhui Huang
Weiming Zhu
225
7
0
04 Feb 2021
Recent Advances in Adversarial Training for Adversarial Robustness
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Tao Bai
Jinqi Luo
Jun Zhao
Bihan Wen
Qian Wang
AAML
349
562
0
02 Feb 2021
Kähler Geometry of Quiver Varieties and Machine Learning
G. Jeffreys
Siu-Cheong Lau
120
5
0
27 Jan 2021
On the Differentially Private Nature of Perturbed Gradient Descent
Thulasi Tholeti
Sheetal Kalyani
115
1
0
18 Jan 2021
Towards glass-box CNNs
Manaswini Piduguralla
Jignesh S. Bhatt
130
4
0
11 Jan 2021
Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs
Dennis Ulmer
136
0
0
03 Jan 2021
Topological obstructions in neural networks learning
S. Barannikov
Daria Voronkova
I. Trofimov
Alexander Korotin
Grigorii Sotnikov
Evgeny Burnaev
169
6
0
31 Dec 2020
Stochastic Approximation for Online Tensorial Independent Component Analysis
Annual Conference Computational Learning Theory (COLT), 2020
C. J. Li
Sai Li
223
2
0
28 Dec 2020
Adaptively Solving the Local-Minimum Problem for Deep Neural Networks
Huachuan Wang
J. Lo
ODL
107
5
0
25 Dec 2020
Physical deep learning based on optimal control of dynamical systems
Physical Review Applied (PR Applied), 2020
Genki Furuhata
T. Niiyama
S. Sunada
PINN
AI4CE
273
18
0
16 Dec 2020
A Deep Graph Neural Networks Architecture Design: From Global Pyramid-like Shrinkage Skeleton to Local Topology Link Rewiring
Gege Zhang
79
0
0
16 Dec 2020
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar
Yash Khasbage
Rahul Vigneswaran
V. Balasubramanian
188
51
0
07 Dec 2020
A Variant of Gradient Descent Algorithm Based on Gradient Averaging
Saugata Purkayastha
Sukannya Purkayastha
ODL
99
2
0
04 Dec 2020
Convergence Analysis of Homotopy-SGD for non-convex optimization
Matilde Gargiani
Andrea Zanelli
Quoc Tran-Dinh
Moritz Diehl
Katharina Eggensperger
135
3
0
20 Nov 2020
A Random Matrix Theory Approach to Damping in Deep Learning
Diego Granziol
Nicholas P. Baskerville
AI4CE
ODL
296
3
0
15 Nov 2020
Minimal Model Structure Analysis for Input Reconstruction in Federated Learning
Jia Qian
Hiba Nassar
Lars Kai Hansen
FedML
232
9
0
29 Oct 2020
Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models
Physical Review Research (PRResearch), 2020
J. Rocks
Pankaj Mehta
373
53
0
26 Oct 2020
Exploring the Security Boundary of Data Reconstruction via Neuron Exclusivity Analysis
USENIX Security Symposium (USENIX Security), 2020
Xudong Pan
Mi Zhang
Yifan Yan
Jiaming Zhu
Zhemin Yang
AAML
171
24
0
26 Oct 2020
Not all parameters are born equal: Attention is mostly what you need
Nikolay Bogoychev
MoE
112
9
0
22 Oct 2020
Deep Neural Networks Are Congestion Games: From Loss Landscape to Wardrop Equilibrium and Beyond
Nina Vesseron
I. Redko
Charlotte Laclau
76
5
0
21 Oct 2020
Softmax Deep Double Deterministic Policy Gradients
Neural Information Processing Systems (NeurIPS), 2020
Ling Pan
Qingpeng Cai
Longbo Huang
214
114
0
19 Oct 2020
An adaptive Hessian approximated stochastic gradient MCMC method
Journal of Computational Physics (JCP), 2020
Yating Wang
Wei Deng
Guang Lin
BDL
76
5
0
03 Oct 2020
Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties
Brett Daley
Chris Amato
ODL
126
4
0
03 Oct 2020
Optimization Landscapes of Wide Deep Neural Networks Are Benign
Johannes Lederer
197
9
0
02 Oct 2020
Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates
Neural Computation (Neural Comput.), 2020
Chen Zeno
Itay Golan
Elad Hoffer
Daniel Soudry
OOD
FedML
201
49
0
01 Oct 2020
Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization
Xuezhe Ma
ODL
328
33
0
28 Sep 2020
Escaping Saddle-Points Faster under Interpolation-like Conditions
Abhishek Roy
Krishnakumar Balasubramanian
Saeed Ghadimi
P. Mohapatra
220
1
0
28 Sep 2020
Anomalous diffusion dynamics of learning in deep neural networks
Neural Networks (NN), 2020
Guozhang Chen
Chengqing Qu
P. Gong
187
23
0
22 Sep 2020
Stochastic Gradient Langevin Dynamics Algorithms with Adaptive Drifts
Sehwan Kim
Qifan Song
F. Liang
BDL
104
14
0
20 Sep 2020
Escaping Saddle Points in Ill-Conditioned Matrix Completion with a Scalable Second Order Method
C. Kümmerle
C. M. Verdun
157
6
0
07 Sep 2020
An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
Seyedeh Niusha Alavi Foumani
Ce Guo
Wayne Luk
70
1
0
06 Sep 2020
An FPGA Accelerated Method for Training Feed-forward Neural Networks Using Alternating Direction Method of Multipliers and LSMR
Seyedeh Niusha Alavi Foumani
Ce Guo
Wayne Luk
90
3
0
06 Sep 2020
Optimizing Mode Connectivity via Neuron Alignment
Neural Information Processing Systems (NeurIPS), 2020
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
MoMe
593
91
0
05 Sep 2020
Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling
Tsendsuren Munkhdalai
CLL
OffRL
130
5
0
03 Sep 2020