Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Dong Wang
Surya Ganguli
Yoshua Bengio
    ODL

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 631 papers shown
Escaping Saddle Points with Compressed SGD
Neural Information Processing Systems (NeurIPS), 2021
Dmitrii Avdiukhin
G. Yaroslavtsev
162
4
0
21 May 2021
Apply Artificial Neural Network to Solving Manpower Scheduling Problem
Tianyu Liu
Lingyu Zhang
64
2
0
07 May 2021
A Bi-Encoder LSTM Model For Learning Unstructured Dialogs
Diwanshu Shekhar
P. Negi
Mohammad H. Mahoor
96
2
0
25 Apr 2021
Exact Stochastic Second Order Deep Learning
F. Mehouachi
C. Kasmi
ODL
61
1
0
08 Apr 2021
Training Deep Neural Networks via Branch-and-Bound
Yuanwei Wu
Ziming Zhang
Guanghui Wang
ODL
222
0
0
05 Apr 2021
Generative Minimization Networks: Training GANs Without Competition
Paulina Grnarova
Yannic Kilcher
Kfir Y. Levy
Aurelien Lucchi
Thomas Hofmann
GAN
88
8
0
23 Mar 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Journal of Nonlinear Science (J. Nonlinear Sci.), 2021
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
190
12
0
19 Mar 2021
Escaping Saddle Points in Distributed Newton's Method with Communication Efficiency and Byzantine Resilience
Avishek Ghosh
R. Maity
A. Mazumdar
Kannan Ramchandran
FedML
262
4
0
17 Mar 2021
Hessian Eigenspectra of More Realistic Nonlinear Models
Neural Information Processing Systems (NeurIPS), 2021
Zhenyu Liao
Michael W. Mahoney
229
37
0
02 Mar 2021
Panel semiparametric quantile regression neural network for electricity consumption forecasting
Ecological Informatics (Ecol. Inform.), 2021
Xingcai Zhou
Jiangyan Wang
AI4TS
104
19
0
01 Mar 2021
Spurious Local Minima Are Common for Deep Neural Networks with Piecewise Linear Activations
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Bo Liu
123
8
0
25 Feb 2021
Learning Neural Network Subspaces
International Conference on Machine Learning (ICML), 2021
Mitchell Wortsman
Maxwell Horton
Carlos Guestrin
Ali Farhadi
Mohammad Rastegari
UQCV
265
96
0
20 Feb 2021
Training Aware Sigmoidal Optimizer
David Macêdo
Pedro Dreyer
Teresa B Ludermir
Cleber Zanchettin
ODL
52
2
0
17 Feb 2021
Appearance of Random Matrix Theory in Deep Learning
Nicholas P. Baskerville
Diego Granziol
J. Keating
271
12
0
12 Feb 2021
Exploiting Spline Models for the Training of Fully Connected Layers in Neural Network
Kanya Mo
Shen Zheng
Xiwei Wang
Jinghua Wang
Klaus-Dieter Schewe Zhejiang University
59
0
0
12 Feb 2021
SGD in the Large: Average-case Analysis, Asymptotics, and Stepsize Criticality
Annual Conference Computational Learning Theory (COLT), 2021
Courtney Paquette
Kiwon Lee
Fabian Pedregosa
Elliot Paquette
117
38
0
08 Feb 2021
Escaping Saddle Points for Nonsmooth Weakly Convex Functions via Perturbed Proximal Algorithms
Minhui Huang
Weiming Zhu
225
7
0
04 Feb 2021
Recent Advances in Adversarial Training for Adversarial Robustness
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Tao Bai
Jinqi Luo
Jun Zhao
Bihan Wen
Qian Wang
AAML
349
562
0
02 Feb 2021
Kähler Geometry of Quiver Varieties and Machine Learning
G. Jeffreys
Siu-Cheong Lau
120
5
0
27 Jan 2021
On the Differentially Private Nature of Perturbed Gradient Descent
Thulasi Tholeti
Sheetal Kalyani
115
1
0
18 Jan 2021
Towards glass-box CNNs
Manaswini Piduguralla
Jignesh S. Bhatt
130
4
0
11 Jan 2021
Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs
Dennis Ulmer
136
0
0
03 Jan 2021
Topological obstructions in neural networks learning
S. Barannikov
Daria Voronkova
I. Trofimov
Alexander Korotin
Grigorii Sotnikov
Evgeny Burnaev
169
6
0
31 Dec 2020
Stochastic Approximation for Online Tensorial Independent Component Analysis
Annual Conference Computational Learning Theory (COLT), 2020
C. J. Li
Sai Li
223
2
0
28 Dec 2020
Adaptively Solving the Local-Minimum Problem for Deep Neural Networks
Huachuan Wang
J. Lo
ODL
107
5
0
25 Dec 2020
Physical deep learning based on optimal control of dynamical systems
Physical Review Applied (PR Applied), 2020
Genki Furuhata
T. Niiyama
S. Sunada
PINN, AI4CE
273
18
0
16 Dec 2020
A Deep Graph Neural Networks Architecture Design: From Global Pyramid-like Shrinkage Skeleton to Local Topology Link Rewiring
Gege Zhang
79
0
0
16 Dec 2020
A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization
Adepu Ravi Sankar
Yash Khasbage
Rahul Vigneswaran
V. Balasubramanian
188
51
0
07 Dec 2020
A Variant of Gradient Descent Algorithm Based on Gradient Averaging
Saugata Purkayastha
Sukannya Purkayastha
ODL
99
2
0
04 Dec 2020
Convergence Analysis of Homotopy-SGD for non-convex optimization
Matilde Gargiani
Andrea Zanelli
Quoc Tran-Dinh
Moritz Diehl
Katharina Eggensperger
135
3
0
20 Nov 2020
A Random Matrix Theory Approach to Damping in Deep Learning
Diego Granziol
Nicholas P. Baskerville
AI4CE, ODL
296
3
0
15 Nov 2020
Minimal Model Structure Analysis for Input Reconstruction in Federated Learning
Jia Qian
Hiba Nassar
Lars Kai Hansen
FedML
232
9
0
29 Oct 2020
Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models
Physical Review Research (PRResearch), 2020
J. Rocks
Pankaj Mehta
373
53
0
26 Oct 2020
Exploring the Security Boundary of Data Reconstruction via Neuron Exclusivity Analysis
USENIX Security Symposium (USENIX Security), 2020
Xudong Pan
Mi Zhang
Yifan Yan
Jiaming Zhu
Zhemin Yang
AAML
171
24
0
26 Oct 2020
Not all parameters are born equal: Attention is mostly what you need
Nikolay Bogoychev
MoE
112
9
0
22 Oct 2020
Deep Neural Networks Are Congestion Games: From Loss Landscape to Wardrop Equilibrium and Beyond
Nina Vesseron
I. Redko
Charlotte Laclau
76
5
0
21 Oct 2020
Softmax Deep Double Deterministic Policy Gradients
Neural Information Processing Systems (NeurIPS), 2020
Ling Pan
Qingpeng Cai
Longbo Huang
214
114
0
19 Oct 2020
An adaptive Hessian approximated stochastic gradient MCMC method
Journal of Computational Physics (JCP), 2020
Yating Wang
Wei Deng
Guang Lin
BDL
76
5
0
03 Oct 2020
Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties
Brett Daley
Chris Amato
ODL
126
4
0
03 Oct 2020
Optimization Landscapes of Wide Deep Neural Networks Are Benign
Johannes Lederer
197
9
0
02 Oct 2020
Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates
Neural Computation (Neural Comput.), 2020
Chen Zeno
Itay Golan
Elad Hoffer
Daniel Soudry
OOD, FedML
201
49
0
01 Oct 2020
Apollo: An Adaptive Parameter-wise Diagonal Quasi-Newton Method for Nonconvex Stochastic Optimization
Xuezhe Ma
ODL
328
33
0
28 Sep 2020
Escaping Saddle-Points Faster under Interpolation-like Conditions
Abhishek Roy
Krishnakumar Balasubramanian
Saeed Ghadimi
P. Mohapatra
220
1
0
28 Sep 2020
Anomalous diffusion dynamics of learning in deep neural networks
Neural Networks (NN), 2020
Guozhang Chen
Chengqing Qu
P. Gong
187
23
0
22 Sep 2020
Stochastic Gradient Langevin Dynamics Algorithms with Adaptive Drifts
Sehwan Kim
Qifan Song
F. Liang
BDL
104
14
0
20 Sep 2020
Escaping Saddle Points in Ill-Conditioned Matrix Completion with a Scalable Second Order Method
C. Kümmerle
C. M. Verdun
157
6
0
07 Sep 2020
An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
Seyedeh Niusha Alavi Foumani
Ce Guo
Wayne Luk
70
1
0
06 Sep 2020
An FPGA Accelerated Method for Training Feed-forward Neural Networks Using Alternating Direction Method of Multipliers and LSMR
Seyedeh Niusha Alavi Foumani
Ce Guo
Wayne Luk
90
3
0
06 Sep 2020
Optimizing Mode Connectivity via Neuron Alignment
Neural Information Processing Systems (NeurIPS), 2020
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
MoMe
593
91
0
05 Sep 2020
Sparse Meta Networks for Sequential Adaptation and its Application to Adaptive Language Modelling
Tsendsuren Munkhdalai
CLL, OffRL
130
5
0
03 Sep 2020