On skip connections and normalisation layers in deep optimisation

10 October 2022

Papers citing "On skip connections and normalisation layers in deep optimisation"

Understanding Gradient Descent on Edge of Stability in Deep Learning
Sanjeev Arora, Zhiyuan Li, A. Panigrahi
MLT | 75 | 89 | 0 | 19 May 2022

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping
James Martens, Andy Ballard, Guillaume Desjardins, G. Swirszcz, Valentin Dalibard, Jascha Narain Sohl-Dickstein, S. Schoenholz
83 | 43 | 0 | 05 Oct 2021

On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
31 | 49 | 0 | 24 Jan 2021

RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding, X. Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian-jun Sun
117 | 1,544 | 0 | 11 Jan 2021

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington
220 | 348 | 0 | 14 Jun 2018

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition
Hamed Karimi, J. Nutini, Mark W. Schmidt
119 | 1,198 | 0 | 16 Aug 2016