Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

Neural Information Processing Systems (NeurIPS), 2014
10 June 2014
Yann N. Dauphin
Razvan Pascanu
Çağlar Gülçehre
Kyunghyun Cho
Surya Ganguli
Yoshua Bengio
    ODL
arXiv: 1406.2572

Papers citing "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization"

50 / 631 papers shown
A Walk with SGD
Chen Xing
Devansh Arpit
Christos Tsirigotis
Yoshua Bengio
294
129
0
24 Feb 2018
Sensitivity and Generalization in Neural Networks: an Empirical Study
Roman Novak
Yasaman Bahri
Daniel A. Abolafia
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
AAML
342
477
0
23 Feb 2018
Hessian-based Analysis of Large Batch Training and Robustness to Adversaries
Z. Yao
A. Gholami
Qi Lei
Kurt Keutzer
Michael W. Mahoney
382
176
0
22 Feb 2018
Generalization in Machine Learning via Analytical Learning Theory
Kenji Kawaguchi
Yoshua Bengio
Vikas Verma
Leslie Pack Kaelbling
106
10
0
21 Feb 2018
Memcomputing: Leveraging memory and physics to compute efficiently
M. Di Ventra
F. Traversa
135
79
0
20 Feb 2018
EA-CG: An Approximate Second-Order Method for Training Fully-Connected Neural Networks
Sheng-Wei Chen
Chun-Nan Chou
Edward Y. Chang
142
5
0
19 Feb 2018
Understanding the Loss Surface of Neural Networks for Binary Classification
Shiyu Liang
Tian Ding
Shouqing Yang
R. Srikant
222
91
0
19 Feb 2018
Guaranteed Recovery of One-Hidden-Layer Neural Networks via Cross Entropy
H. Fu
Yuejie Chi
Yingbin Liang
FedML
319
41
0
18 Feb 2018
Spurious Valleys in Two-layer Neural Network Optimization Landscapes
Luca Venturi
Afonso S. Bandeira
Joan Bruna
281
75
0
18 Feb 2018
Model compression via distillation and quantization
A. Polino
Razvan Pascanu
Dan Alistarh
MQ
257
799
0
15 Feb 2018
The Mechanics of n-Player Differentiable Games
David Balduzzi
S. Racanière
James Martens
Jakob N. Foerster
K. Tuyls
T. Graepel
MLT
241
290
0
15 Feb 2018
A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization
Tianyi Liu
Zhehui Chen
Enlu Zhou
T. Zhao
192
14
0
14 Feb 2018
Deep Neural Networks Learn Non-Smooth Functions Effectively
Masaaki Imaizumi
Kenji Fukumizu
294
136
0
13 Feb 2018
signSGD: Compressed Optimisation for Non-Convex Problems
Jeremy Bernstein
Yu Wang
Kamyar Azizzadenesheli
Anima Anandkumar
FedML ODL
491
1,162
0
13 Feb 2018
Critical Percolation as a Framework to Analyze the Training of Deep Networks
Zohar Ringel
Rodrigo Andrade de Bem
88
3
0
06 Feb 2018
Digital Watermarking for Deep Neural Networks
Yuki Nagai
Yusuke Uchida
S. Sakazawa
Shin'ichi Satoh
WIGM
161
155
0
06 Feb 2018
Rover Descent: Learning to optimize by learning to navigate on prototypical loss surfaces
Louis Faury
Flavian Vasile
116
2
0
22 Jan 2018
Structured Inhomogeneous Density Map Learning for Crowd Counting
Hanhui Li
Xiangjian He
Hefeng Wu
Saeed Amirgholipour Kasmani
Ruomei Wang
Xiaonan Luo
Liang Lin
99
12
0
20 Jan 2018
Near Maximum Likelihood Decoding with Deep Learning
Eliya Nachmani
Yaron Bachar
Elad Marciano
D. Burshtein
Yair Be’ery
126
26
0
08 Jan 2018
Generating Neural Networks with Neural Networks
Lior Deutsch
271
22
0
06 Jan 2018
High Dimensional Spaces, Deep Learning and Adversarial Examples
S. Dube
360
28
0
02 Jan 2018
Accelerating Deep Learning with Memcomputing
Haik Manukian
F. Traversa
M. Di Ventra
AI4CE
196
33
0
01 Jan 2018
The Multilinear Structure of ReLU Networks
International Conference on Machine Learning (ICML), 2017
T. Laurent
J. V. Brecht
174
53
0
29 Dec 2017
Visualizing the Loss Landscape of Neural Nets
Neural Information Processing Systems (NeurIPS), 2017
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
546
2,116
0
28 Dec 2017
Non-convex Optimization for Machine Learning
Prateek Jain
Purushottam Kar
322
504
0
21 Dec 2017
ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent
Vishwak Srinivasan
Adepu Ravi Sankar
V. Balasubramanian
ODL
76
17
0
20 Dec 2017
Block-diagonal Hessian-free Optimization for Training Neural Networks
Huishuai Zhang
Caiming Xiong
James Bradbury
R. Socher
ODL
112
24
0
20 Dec 2017
Third-order Smoothness Helps: Even Faster Stochastic Optimization Algorithms for Finding Local Minima
Yaodong Yu
Pan Xu
Quanquan Gu
134
3
0
18 Dec 2017
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Edoardo Conti
Vashisht Madhavan
F. Such
Joel Lehman
Kenneth O. Stanley
Jeff Clune
258
368
0
18 Dec 2017
Deep Learning for Distant Speech Recognition
Mirco Ravanelli
119
16
0
17 Dec 2017
Mathematics of Deep Learning
René Vidal
Joan Bruna
Raja Giryes
Stefano Soatto
OOD
133
120
0
13 Dec 2017
Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently
Yaodong Yu
Difan Zou
Quanquan Gu
97
10
0
11 Dec 2017
Deep convolutional neural networks for brain image analysis on magnetic resonance imaging: a review
J. Bernal
Kaisar Kushibar
Daniel S. Asfaw
Sergi Valverde
A. Oliver
Robert Martí
Xavier Llado
169
367
0
11 Dec 2017
Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks
Shankar Krishnan
Ying Xiao
Rif A. Saurous
ODL
130
20
0
08 Dec 2017
Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent
Chi Jin
Praneeth Netrapalli
Sai Li
ODL
198
280
0
28 Nov 2017
Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks
Ziming Zhang
M. Brand
102
77
0
20 Nov 2017
Neon2: Finding Local Minima via First-Order Oracles
Zeyuan Allen-Zhu
Yuanzhi Li
299
138
0
17 Nov 2017
Online Deep Learning: Learning Deep Neural Networks on the Fly
Doyen Sahoo
Quang Pham
Jing Lu
Guosheng Lin
OnRL AI4CE
170
337
0
10 Nov 2017
Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels
Kai Zhong
Zhao Song
Inderjit S. Dhillon
134
75
0
08 Nov 2017
One Model to Rule them all: Multitask and Multilingual Modelling for Lexical Analysis
Johannes Bjerva
145
6
0
03 Nov 2017
Critical Points of Neural Networks: Analytical Forms and Landscape Properties
Yi Zhou
Yingbin Liang
173
54
0
30 Oct 2017
Optimization Landscape and Expressivity of Deep CNNs
International Conference on Learning Representations (ICLR), 2017
Quynh N. Nguyen
Matthias Hein
250
29
0
30 Oct 2017
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior
Charles H. Martin
Michael W. Mahoney
AI4CE
145
64
0
26 Oct 2017
First-order Methods Almost Always Avoid Saddle Points
Jason D. Lee
Ioannis Panageas
Georgios Piliouras
Max Simchowitz
Michael I. Jordan
Benjamin Recht
ODL
215
84
0
20 Oct 2017
Characterization of Gradient Dominance and Regularity Conditions for Neural Networks
Yi Zhou
Yingbin Liang
192
33
0
18 Oct 2017
AdaDNNs: Adaptive Ensemble of Deep Neural Networks for Scene Text Recognition
Chun Yang
Xu-Cheng Yin
Zejun Li
Jianwei Wu
Chunchao Guo
Hongfa Wang
Lei Xiao
118
10
0
10 Oct 2017
How regularization affects the critical points in linear networks
Amirhossein Taghvaei
Jin-Won Kim
P. Mehta
147
13
0
27 Sep 2017
Accelerating SGD for Distributed Deep-Learning Using Approximated Hessian Matrix
Sébastien M. R. Arnold
Chunming Wang
52
0
0
15 Sep 2017
A Generic Approach for Escaping Saddle points
Sashank J. Reddi
Manzil Zaheer
S. Sra
Barnabás Póczós
Francis R. Bach
Ruslan Salakhutdinov
Alex Smola
205
83
0
05 Sep 2017
Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields
Thomas Unterthiner
Bernhard Nessler
Calvin Seward
Günter Klambauer
M. Heusel
Hubert Ramsauer
Sepp Hochreiter
GAN
252
75
0
29 Aug 2017