Depth Creates No Bad Local Minima
27 February 2017
Haihao Lu
Kenji Kawaguchi

Papers citing "Depth Creates No Bad Local Minima"

50 / 72 papers shown

System Identification and Control Using Lyapunov-Based Deep Neural Networks without Persistent Excitation: A Concurrent Learning Approach
Rebecca G. Hart
Omkar Sudhir Patil
Zachary I. Bell
Warren E. Dixon
15 May 2025

Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Neural Information Processing Systems (NeurIPS), 2024
Ziang Chen
Rong Ge
10 Jan 2025

Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity
Zhanran Lin
Puheng Li
Lei Wu
09 Apr 2024

The Expressive Power of Low-Rank Adaptation
International Conference on Learning Representations (ICLR), 2023
Yuchen Zeng
Kangwook Lee
26 Oct 2023

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
01 Jun 2023

Function Space and Critical Points of Linear Convolutional Networks
SIAM Journal on Applied Algebra and Geometry (SIAM J. Appl. Algebra Geom.), 2023
Kathlén Kohn
Guido Montúfar
Vahid Shahverdi
Matthew Trager
12 Apr 2023

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
International Conference on Machine Learning (ICML), 2023
Pierre Bréchet
Katerina Papagiannouli
Jing An
Guido Montúfar
06 Mar 2023

Bayesian Interpolation with Deep Linear Networks
Proceedings of the National Academy of Sciences of the United States of America (PNAS), 2022
Boris Hanin
Alexander Zlokapa
29 Dec 2022

When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Tian Ding
Jianfeng Yao
21 Oct 2022

DiffML: End-to-end Differentiable ML Pipelines
Benjamin Hilprecht
Christian Hammacher
Eduardo Reis
Mohamed Abdelaal
Carsten Binnig
04 Jul 2022

Understanding Deep Learning via Decision Boundary
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Shiye Lei
Fengxiang He
Yancheng Yuan
Dacheng Tao
03 Jun 2022

Posterior Collapse of a Linear Latent Variable Model
Neural Information Processing Systems (NeurIPS), 2022
Zihao Wang
Liu Ziyin
09 May 2022

A Convergence Analysis of Nesterov's Accelerated Gradient Method in Training Deep Linear Neural Networks
Information Sciences (Inf. Sci.), 2022
Xin Liu
Wei Tao
Zhisong Pan
18 Apr 2022

Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks
Social Science Research Network (SSRN), 2022
R. Cont
Alain Rossier
Renyuan Xu
14 Apr 2022

Exact Solutions of a Deep Linear Network
Neural Information Processing Systems (NeurIPS), 2022
Liu Ziyin
Botao Li
Xiangmin Meng
10 Feb 2022

Learning Neural Ranking Models Online from Implicit User Feedback
The Web Conference (WWW), 2022
Yiling Jia
Hongning Wang
17 Jan 2022

Global Convergence Analysis of Deep Linear Networks with A One-neuron Layer
Kun Chen
Dachao Lin
Zhihua Zhang
08 Jan 2022

Geometry of Linear Convolutional Networks
Kathlén Kohn
Thomas Merkh
Guido Montúfar
Matthew Trager
03 Aug 2021

The loss landscape of deep linear neural networks: a second-order analysis
El Mehdi Achour
François Malgouyres
Sébastien Gerchinovitz
28 Jul 2021

Spurious Local Minima Are Common for Deep Neural Networks with Piecewise Linear Activations
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Bo Liu
25 Feb 2021

When Are Solutions Connected in Deep Networks?
Neural Information Processing Systems (NeurIPS), 2021
Quynh N. Nguyen
Pierre Bréchet
Marco Mondelli
18 Feb 2021

Neural Networks with Complex-Valued Weights Have No Spurious Local Minima
Annual Conference on Information Sciences and Systems (CISS), 2021
Xingtu Liu
31 Jan 2021

The Landscape of Multi-Layer Linear Neural Network From the Perspective of Algebraic Geometry
Xiuyi Yang
30 Jan 2021

Recent advances in deep learning theory
Fengxiang He
Dacheng Tao
20 Dec 2020

Notes on Deep Learning Theory
Eugene Golikov
10 Dec 2020

Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Kernel Renormalization
Qianyi Li
H. Sompolinsky
07 Dec 2020

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang
Chi-Heng Lin
Jacob D. Abernethy
04 Oct 2020

Ridge Regression with Over-Parametrized Two-Layer Networks Converge to Ridgelet Spectrum
Sho Sonoda
Isao Ishikawa
Masahiro Ikeda
07 Jul 2020

The Global Landscape of Neural Networks: An Overview
Ruoyu Sun
Dawei Li
Shiyu Liang
Tian Ding
R. Srikant
02 Jul 2020

Flatness is a False Friend
Diego Granziol
16 Jun 2020

Piecewise linear activations substantially shape the loss surfaces of neural networks
International Conference on Learning Representations (ICLR), 2020
Fengxiang He
Bohan Wang
Dacheng Tao
27 Mar 2020

Some Geometrical and Topological Properties of DNNs' Decision Boundaries
Bo Liu
Mengya Shen
07 Mar 2020

On the Global Convergence of Training Deep Linear ResNets
International Conference on Learning Representations (ICLR), 2020
Difan Zou
Philip M. Long
Quanquan Gu
02 Mar 2020

Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 1: Theory
Bo Liu
12 Feb 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
International Conference on Learning Representations (ICLR), 2020
Wei Hu
Lechao Xiao
Jeffrey Pennington
16 Jan 2020

Optimization for deep learning: theory and algorithms
Ruoyu Sun
19 Dec 2019

Sub-Optimal Local Minima Exist for Neural Networks with Almost All Non-Linear Activations
Tian Ding
Dawei Li
Ruoyu Sun
04 Nov 2019

Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
Analysis and Applications (Anal. Appl.), 2019
Yeonjong Shin
14 Oct 2019

Pure and Spurious Critical Points: a Geometric Study of Linear Networks
International Conference on Learning Representations (ICLR), 2019
Matthew Trager
Kathlén Kohn
Joan Bruna
03 Oct 2019

Classification Logit Two-sample Testing by Neural Networks
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2019
Xiuyuan Cheng
A. Cloninger
25 Sep 2019

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Neural Information Processing Systems (NeurIPS), 2019
Stefano Sarao Mannelli
Giulio Biroli
C. Cammarota
Florent Krzakala
Lenka Zdeborová
18 Jul 2019

Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea
Berfin Simsek
Bernd Illing
W. Gerstner
05 Jul 2019

Interpretable Few-Shot Learning via Linear Distillation
Arip Asadulaev
Igor Kuznetsov
Andrey Filchenkov
13 Jun 2019

Decoupling Gating from Linearity
Jonathan Fiat
Eran Malach
Shai Shalev-Shwartz
12 Jun 2019

Loss Surface Modality of Feed-Forward Neural Network Architectures
IEEE International Joint Conference on Neural Networks (IJCNN), 2019
Anna Sergeevna Bosman
A. Engelbrecht
Mardé Helbig
24 May 2019

An Essay on Optimization Mystery of Deep Learning
Eugene Golikov
17 May 2019

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models
International Conference on Machine Learning (ICML), 2019
Stefano Sarao Mannelli
Florent Krzakala
Pierfrancesco Urbani
Lenka Zdeborová
01 Feb 2019

Depth creates no more spurious local minima
Li Zhang
28 Jan 2019

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
24 Jan 2019

On Connected Sublevel Sets in Deep Learning
Quynh N. Nguyen
22 Jan 2019

Page 1 of 2