1611.04231
Identity Matters in Deep Learning
14 November 2016
Moritz Hardt
Tengyu Ma
OOD
Papers citing
"Identity Matters in Deep Learning"
20 / 70 papers shown
Title
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
33
1,251
0
04 Oct 2018
Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks
Ohad Shamir
22
45
0
23 Sep 2018
Training Deeper Neural Machine Translation Models with Transparent Attention
Ankur Bapna
M. Chen
Orhan Firat
Yuan Cao
Yonghui Wu
29
138
0
22 Aug 2018
ResNet with one-neuron hidden layers is a Universal Approximator
Hongzhou Lin
Stefanie Jegelka
28
227
0
28 Jun 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
26
134
0
20 Jun 2018
Understanding Batch Normalization
Johan Bjorck
Carla P. Gomes
B. Selman
Kilian Q. Weinberger
11
592
0
01 Jun 2018
How Does Batch Normalization Help Optimization?
Shibani Santurkar
Dimitris Tsipras
Andrew Ilyas
A. Madry
ODL
16
1,521
0
29 May 2018
Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang
Ruoyu Sun
J. Lee
R. Srikant
29
89
0
22 May 2018
How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?
S. Du
Yining Wang
Xiyu Zhai
Sivaraman Balakrishnan
Ruslan Salakhutdinov
Aarti Singh
SSL
13
57
0
21 May 2018
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
S. Du
Surbhi Goel
MLT
20
17
0
20 May 2018
Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks
Peter L. Bartlett
D. Helmbold
Philip M. Long
23
116
0
16 Feb 2018
Deep Neural Nets with Interpolating Function as Output Activation
Bao Wang
Xiyang Luo
Z. Li
Wei-wei Zhu
Zuoqiang Shi
Stanley J. Osher
20
3
0
01 Feb 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
24
101
0
14 Jan 2018
Visualizing the Loss Landscape of Neural Nets
Hao Li
Zheng Xu
Gavin Taylor
Christoph Studer
Tom Goldstein
63
1,842
0
28 Dec 2017
Global optimality conditions for deep neural networks
Chulhee Yun
S. Sra
Ali Jadbabaie
121
117
0
08 Jul 2017
Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization
Qunwei Li
Yi Zhou
Yingbin Liang
P. Varshney
18
94
0
14 May 2017
Skip Connections Eliminate Singularities
Emin Orhan
Xaq Pitkow
28
25
0
31 Jan 2017
Removal of Batch Effects using Distribution-Matching Residual Networks
Uri Shaham
Kelly P. Stanton
Jun Zhao
Huamin Li
K. Raddassi
Ruth R. Montgomery
Y. Kluger
16
159
0
13 Oct 2016
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi
J. Nutini
Mark W. Schmidt
130
1,198
0
16 Aug 2016
The Loss Surfaces of Multilayer Networks
A. Choromańska
Mikael Henaff
Michaël Mathieu
Gerard Ben Arous
Yann LeCun
ODL
179
1,185
0
30 Nov 2014