Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

16 January 2020

Papers citing "Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks"


Ergodic Generative Flows Leo Maxime Brunswic Mateo Clemente Rui Heng Yang Adam Sigal Amir Rasouli Yinchuan Li 42 0 0 06 May 2025
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks Jim Zhao Sidak Pal Singh Aurélien Lucchi AI4CE 43 0 0 04 Nov 2024
MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition Nicolas Menet Michael Hersche G. Karunaratne Luca Benini Abu Sebastian Abbas Rahimi 28 13 0 05 Dec 2023
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning François Caron Fadhel Ayed Paul Jung Hoileong Lee Juho Lee Hongseok Yang 62 2 0 02 Feb 2023
CyclicFL: A Cyclic Model Pre-Training Approach to Efficient Federated Learning Peng Zhang Yingbo Zhou Ming Hu Xin Fu Xian Wei Mingsong Chen FedML 24 1 0 28 Jan 2023
Dynamical Isometry for Residual Networks Advait Gadhikar R. Burkholz ODL AI4CE 34 2 0 05 Oct 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons Sangmin Lee Byeongsu Sim Jong Chul Ye MLT 94 6 0 27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do Niladri S. Chatterji Philip M. Long 17 8 0 19 Sep 2022
Noise2NoiseFlow: Realistic Camera Noise Modeling without Clean Images Ali Maleky Shayan Kousha M. S. Brown Marcus A. Brubaker VLM DiffM 24 20 0 02 Jun 2022
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks G. Bingham Risto Miikkulainen ODL 24 4 0 18 Sep 2021
Coordinate descent on the orthogonal group for recurrent neural network training E. Massart V. Abrol 29 10 0 30 Jul 2021
The Importance of Pessimism in Fixed-Dataset Policy Optimization Jacob Buckman Carles Gelada Marc G. Bellemare OffRL 20 135 0 15 Sep 2020
Obtaining Adjustable Regularization for Free via Iterate Averaging Jingfeng Wu Vladimir Braverman Lin F. Yang 25 2 0 15 Aug 2020
Deep Isometric Learning for Visual Recognition Haozhi Qi Chong You X. Wang Yi-An Ma Jitendra Malik VLM 24 53 0 30 Jun 2020
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks Soham De Samuel L. Smith ODL 14 20 0 24 Feb 2020
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks Lechao Xiao Yasaman Bahri Jascha Narain Sohl-Dickstein S. Schoenholz Jeffrey Pennington 220 348 0 14 Jun 2018
Global optimality conditions for deep neural networks Chulhee Yun S. Sra Ali Jadbabaie 121 117 0 08 Jul 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation Yonghui Wu M. Schuster Z. Chen Quoc V. Le Mohammad Norouzi ... Alex Rudnick Oriol Vinyals G. Corrado Macduff Hughes J. Dean AIMat 716 6,743 0 26 Sep 2016