Global optimality conditions for deep neural networks
arXiv:1707.02444, 8 July 2017
Chulhee Yun, S. Sra, Ali Jadbabaie

Papers citing "Global optimality conditions for deep neural networks"

50 of 79 citing papers shown (page 1 of 2)

Distributionally Robust Optimization via Diffusion Ambiguity Modeling
Jiaqi Wen, Jianyi Yang
26 Oct 2025

The Markovian Thinker: Architecture-Agnostic Linear Scaling of Reasoning
Milad Aghajohari, Kamran Chitsaz, Amirhossein Kazemnejad, Sarath Chandar, Alessandro Sordoni, Aaron Courville, Siva Reddy
08 Oct 2025

Backward Oversmoothing: why is it hard to train deep Graph Neural Networks?
Nicolas Keriven
22 May 2025

Exploring Loss Landscapes through the Lens of Spin Glass Theory
Hao Liao, Wei Zhang, Zhanyi Huang, Zexiao Long, Mingyang Zhou, Xiaoqun Wu, Rui Mao, Chi Ho Yeung
30 Jul 2024

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion. International Conference on Learning Representations (ICLR), 2023
Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand
03 Oct 2023

Transferring Learning Trajectories of Neural Networks. International Conference on Learning Representations (ICLR), 2023
Daiki Chijiwa
23 May 2023

Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss. International Conference on Machine Learning (ICML), 2023
Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar
06 Mar 2023

A Dynamics Theory of Implicit Regularization in Deep Low-Rank Matrix Factorization
Jian-Peng Cao, Chao Qian, Yihui Huang, Dicheng Chen, Yuncheng Gao, Jiyang Dong, D. Guo, X. Qu
29 Dec 2022

Piecewise Linear Neural Networks and Deep Learning. Nature Reviews Methods Primers (NRMP), 2022
Qinghua Tao, Li Li, Xiaolin Huang, Xiangming Xi, Shuning Wang, Johan A. K. Suykens
18 Jun 2022

Parameter Convex Neural Networks
Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng
11 Jun 2022

Memorization-Dilation: Modeling Neural Collapse Under Label Noise
Duc Anh Nguyen, Ron Levie, Julian Lienen, Gitta Kutyniok, Eyke Hüllermeier
11 Jun 2022

Statistical Guarantees for Approximate Stationary Points of Shallow Neural Networks
Mahsa Taheri, Fang Xie, Johannes Lederer
09 May 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape. International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Devansh Bisla, Jing Wang, A. Choromańska
20 Jan 2022

Complexity from Adaptive-Symmetries Breaking: Global Minima in the Statistical Mechanics of Deep Neural Networks
Shaun Li
03 Jan 2022

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright
06 Oct 2021

Convergence of gradient descent for learning linear neural networks. Advances in Continuous and Discrete Models (ACDM), 2021
Gabin Maxime Nguegnang, Holger Rauhut, Ulrich Terstiege
04 Aug 2021

The loss landscape of deep linear neural networks: a second-order analysis
El Mehdi Achour, François Malgouyres, Sébastien Gerchinovitz
28 Jul 2021

Analysis and Optimisation of Bellman Residual Errors with Neural Function Approximation
Martin Gottwald, Sven Gronauer, Hao Shen, Klaus Diepold
16 Jun 2021

Overparameterization of deep ResNet: zero loss and mean-field analysis. Journal of Machine Learning Research (JMLR), 2021
Zhiyan Ding, Shi Chen, Qin Li, S. Wright
30 May 2021

A Geometric Analysis of Neural Collapse with Unconstrained Features. Neural Information Processing Systems (NeurIPS), 2021
Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu
06 May 2021

Noether: The More Things Change, the More Stay the Same
Grzegorz Gluch, R. Urbanke
12 Apr 2021

Training Deep Neural Networks via Branch-and-Bound
Yuanwei Wu, Ziming Zhang, Guanghui Wang
05 Apr 2021

Spurious Local Minima Are Common for Deep Neural Networks with Piecewise Linear Activations. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Bo Liu
25 Feb 2021

When Are Solutions Connected in Deep Networks? Neural Information Processing Systems (NeurIPS), 2021
Quynh N. Nguyen, Pierre Bréchet, Marco Mondelli
18 Feb 2021

The Landscape of Multi-Layer Linear Neural Network From the Perspective of Algebraic Geometry
Xiuyi Yang
30 Jan 2021

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin
12 Jan 2021

A Survey on Neural Network Interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence (IEEE TETCI), 2020
Yu Zhang, Peter Tiño, A. Leonardis, Shengcai Liu
28 Dec 2020

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. International Conference on Machine Learning (ICML), 2020
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy
04 Oct 2020

From Symmetry to Geometry: Tractable Nonconvex Problems
Yuqian Zhang, Qing Qu, John N. Wright
14 Jul 2020

Ridge Regression with Over-Parametrized Two-Layer Networks Converge to Ridgelet Spectrum
Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
07 Jul 2020

The Global Landscape of Neural Networks: An Overview
Tian Ding, Dawei Li, Shiyu Liang, R. Srikant
02 Jul 2020

Piecewise linear activations substantially shape the loss surfaces of neural networks. International Conference on Learning Representations (ICLR), 2020
Fengxiang He, Bohan Wang, Dacheng Tao
27 Mar 2020

Some Geometrical and Topological Properties of DNNs' Decision Boundaries
Bo Liu, Mengya Shen
07 Mar 2020

On the Global Convergence of Training Deep Linear ResNets. International Conference on Learning Representations (ICLR), 2020
Difan Zou, Philip M. Long, Quanquan Gu
02 Mar 2020

Understanding Global Loss Landscape of One-hidden-layer ReLU Networks, Part 1: Theory
Bo Liu
12 Feb 2020

Learning CHARME models with neural networks
José G. Gómez-García, M. Fadili, C. Chesneau
08 Feb 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks. International Conference on Learning Representations (ICLR), 2020
Wei Hu, Lechao Xiao, Jeffrey Pennington
16 Jan 2020

Optimization for deep learning: theory and algorithms
Tian Ding
19 Dec 2019

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? International Conference on Learning Representations (ICLR), 2019
Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu
27 Nov 2019

Bregman Proximal Framework for Deep Linear Neural Networks
Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Zorah Lähner, Peter Ochs
08 Oct 2019

Pure and Spurious Critical Points: a Geometric Study of Linear Networks. International Conference on Learning Representations (ICLR), 2019
Matthew Trager, Kathlén Kohn, Joan Bruna
03 Oct 2019

Student Specialization in Deep ReLU Networks With Finite Width and Input Dimension
Yuandong Tian
30 Sep 2019

Distance Geometry and Data Science. TOP - An Official Journal of the Spanish Society of Statistics and Operations Research (TOP), 2019
Leo Liberti
18 Sep 2019

Neural Architecture Search by Estimation of Network Structure Distributions
A. Muravev, Jenni Raitoharju, Moncef Gabbouj
19 Aug 2019

Are deep ResNets provably better than linear predictors? Neural Information Processing Systems (NeurIPS), 2019
Chulhee Yun, S. Sra, Ali Jadbabaie
09 Jul 2019

Semi-Implicit Generative Model
Mingzhang Yin, Mingyuan Zhou
29 May 2019

Fine-grained Optimization of Deep Neural Networks. Neural Information Processing Systems (NeurIPS), 2019
Mete Ozay
22 May 2019

Orthogonal Deep Neural Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Kui Jia, Shuai Li, Yuxin Wen, Tongliang Liu, Dacheng Tao
15 May 2019

Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Yuan Cao, Quanquan Gu
04 Feb 2019

Depth creates no more spurious local minima
Li Zhang
28 Jan 2019