Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

25 February 2016

Papers citing "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks"

50 / 957 papers shown

Title
Universal Score-based Speech Enhancement with High Content Preservation Robin Scheibler Yusuke Fujita Yuma Shirahata Tatsuya Komatsu DiffM 32 10 0 18 Jun 2024
Let Go of Your Labels with Unsupervised Transfer Artyom Gadetsky Yulun Jiang Maria Brbić VLM 32 6 0 11 Jun 2024
Compute Better Spent: Replacing Dense Layers with Structured Matrices Shikai Qiu Andres Potapczynski Marc Finzi Micah Goldblum Andrew Gordon Wilson 32 11 0 10 Jun 2024
Conv-INR: Convolutional Implicit Neural Representation for Multimodal Visual Signals Zhicheng Cai 36 0 0 06 Jun 2024
Encoding Semantic Priors into the Weights of Implicit Neural Representation Zhicheng Cai Qiu Shen 28 0 0 06 Jun 2024
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation Min-Jae Hwang Ilia Kulikov Benjamin Peloquin Hongyu Gong Peng-Jen Chen Ann Lee 27 1 0 04 Jun 2024
Automatic Dance Video Segmentation for Understanding Choreography Koki Endo Shuhei Tsuchida Tsukasa Fukusato Takeo Igarashi VOS 16 0 0 30 May 2024
Supervised Batch Normalization Bilal Faye M. Lebbah Hanane Azzag 18 1 0 27 May 2024
CNN-based Compressor Mass Flow Estimator in Industrial Aircraft Vapor Cycle System Justin Reverdi Sixin Zhang Said Aoues Fabrice Gamboa Serge Gratton Thomas Pellegrini 34 0 0 27 May 2024
Can Implicit Bias Imply Adversarial Robustness? Hancheng Min René Vidal 34 3 0 24 May 2024
UnitNorm: Rethinking Normalization for Transformers in Time Series Nan Huang C. Kümmerle Xiang Zhang AI4TS 22 2 0 24 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models Akide Liu Jing Liu Zizheng Pan Yefei He Gholamreza Haffari Bohan Zhuang MQ 35 30 0 23 May 2024
Quantile Activation: Correcting a Failure Mode of ML Models Aditya Challa Sravan Danda Laurent Najman Snehanshu Saha UQCV 36 0 0 19 May 2024
DINO as a von Mises-Fisher mixture model Hariprasath Govindarajan Per Sidén Jacob Roll Fredrik Lindsten 39 11 0 17 May 2024
HILCodec: High Fidelity and Lightweight Neural Audio Codec S. Ahn Beom Jun Woo Mingrui Han Chanyeong Moon Nam Soo Kim 21 6 0 08 May 2024
Hidden Synergy: $L_1$ Weight Normalization and 1-Path-Norm Regularization Aditya Biswas 36 0 0 29 Apr 2024
Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks Yassine Abbahaddou Sofiane Ennadir J. Lutzeyer Michalis Vazirgiannis Henrik Bostrom AAML OOD 29 6 0 27 Apr 2024
Hard ASH: Sparsity and the right optimizer make a continual learner Santtu Keskinen CLL 37 1 0 26 Apr 2024
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder Yicheng Gu Xueyao Zhang Liumeng Xue Haizhou Li Zhizheng Wu 28 2 0 26 Apr 2024
FlashSpeech: Efficient Zero-Shot Speech Synthesis Zhen Ye Zeqian Ju Haohe Liu Xu Tan Jianyi Chen ... Weizhen Bian Shulin He Qi-fei Liu Yi-Ting Guo Wei Xue 38 16 0 23 Apr 2024
K-percent Evaluation for Lifelong RL Golnaz Mesbahi Parham Mohammad Panahi Olya Mastikhina Martha White Adam White CLL OffRL 26 0 0 02 Apr 2024
Privacy Re-identification Attacks on Tabular GANs Abdallah Alshantti Adil Rasheed Frank Westad AAML 19 3 0 31 Mar 2024
Learning in PINNs: Phase transition, total diffusion, and generalization Sokratis J. Anagnostopoulos Juan Diego Toscano Nikolaos Stergiopulos George Karniadakis 24 10 0 27 Mar 2024
DeepMIF: Deep Monotonic Implicit Fields for Large-Scale LiDAR 3D Mapping Kutay Yilmaz Matthias Nießner Anastasiia Kornilova Alexey Artemov 30 0 0 26 Mar 2024
On permutation-invariant neural networks Masanari Kimura Ryotaro Shimizu Yuki Hirakawa Ryosuke Goto Yuki Saito OOD AAML 35 12 0 26 Mar 2024
SF(DA)$^2$: Source-free Domain Adaptation Through the Lens of Data Augmentation Uiwon Hwang Jonghyun Lee Juhyeon Shin Sungroh Yoon 26 10 0 16 Mar 2024
Linearly Constrained Weights: Reducing Activation Shift for Faster Training of Neural Networks Takuro Kutsuna LLMSV 19 1 0 08 Mar 2024
SGD with Partial Hessian for Deep Neural Networks Optimization Ying Sun Hongwei Yong Lei Zhang ODL 28 0 0 05 Mar 2024
Disentangling the Causes of Plasticity Loss in Neural Networks Clare Lyle Zeyu Zheng Khimya Khetarpal H. V. Hasselt Razvan Pascanu James Martens Will Dabney AI4CE 53 31 0 29 Feb 2024
Enhancing Hypergradients Estimation: A Study of Preconditioning and Reparameterization Zhenzhang Ye Gabriel Peyré Daniel Cremers Pierre Ablin 34 2 0 26 Feb 2024
Layer-wise Regularized Dropout for Neural Language Models Shiwen Ni Min Yang Ruifeng Xu Chengming Li Xiping Hu 30 0 0 26 Feb 2024
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning Nik Vaessen David A. van Leeuwen 30 3 0 21 Feb 2024
DoRA: Weight-Decomposed Low-Rank Adaptation Shih-yang Liu Chien-Yi Wang Hongxu Yin Pavlo Molchanov Yu-Chiang Frank Wang Kwang-Ting Cheng Min-Hung Chen 27 337 0 14 Feb 2024
Data Reconstruction Attacks and Defenses: A Systematic Evaluation Sheng Liu Zihan Wang Yuxiao Chen Qi Lei AAML MIACV 59 4 0 13 Feb 2024
ODIN: Disentangled Reward Mitigates Hacking in RLHF Lichang Chen Chen Zhu Davit Soselia Jiuhai Chen Tianyi Zhou Tom Goldstein Heng-Chiao Huang M. Shoeybi Bryan Catanzaro AAML 42 51 0 11 Feb 2024
How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers G. Buzaglo I. Harel Mor Shpigel Nacson Alon Brutzkus Nathan Srebro Daniel Soudry 54 3 0 09 Feb 2024
Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation Pedro Vianna Muawiz Chaudhary Paria Mehrbod An Tang Guy Cloutier Guy Wolf Michael Eickenberg Eugene Belilovsky OOD 29 3 0 07 Feb 2024
Positive concave deep equilibrium models Mateusz Gabor Tomasz Piotrowski Renato L. G. Cavalcante 21 2 0 06 Feb 2024
Understanding What Affects Generalization Gap in Visual Reinforcement Learning: Theory and Empirical Evidence Jiafei Lyu Le Wan Xiu Li Zongqing Lu CML OffRL 33 2 0 05 Feb 2024
Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale Yangyang Shu Xiaofeng Cao Qi Chen Bowen Zhang Ziqin Zhou A. Hengel Lingqiao Liu 21 0 0 02 Feb 2024
Stochastic Modified Flows for Riemannian Stochastic Gradient Descent Benjamin Gess Sebastian Kassing Nimit Rana 34 0 0 02 Feb 2024
HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction Harvie Zhang 31 0 0 31 Jan 2024
CNG-SFDA: Clean-and-Noisy Region Guided Online-Offline Source-Free Domain Adaptation Hyeonwoo Cho Chanmin Park Donghee Kim Jinyoung Kim Won Hwa Kim TTA 24 0 0 26 Jan 2024
Understanding the Generalization Benefits of Late Learning Rate Decay Yinuo Ren Chao Ma Lexing Ying AI4CE 26 6 0 21 Jan 2024
HARDCORE: H-field and power loss estimation for arbitrary waveforms with residual, dilated convolutional neural networks in ferrite cores Wilhelm Kirchgässner Nikolas Förster T. Piepenbrock Oliver Schweins Oliver Wallscheid 11 5 0 21 Jan 2024
Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis Prabhav Agrawal Thilo Köhler Zhiping Xiu Prashant Serai Qing He 18 1 0 19 Jan 2024
A2Q+: Improving Accumulator-Aware Weight Quantization Ian Colbert Alessandro Pappalardo Jakoba Petri-Koenig Yaman Umuroglu MQ 21 4 0 19 Jan 2024
Symbolic Manipulation Planning with Discovered Object and Relational Predicates Alper Ahmetoglu Erhan Öztop Emre Ugur 39 5 0 02 Jan 2024
Objects as volumes: A stochastic geometry view of opaque solids Bailey Miller Hanyu Chen Alice Lai Ioannis Gkioulekas 43 5 0 24 Dec 2023
Fed-CO2: Cooperation of Online and Offline Models for Severe Data Heterogeneity in Federated Learning Zhongyi Cai Ye-ling Shi Wei Huang Jingya Wang FedML 24 4 0 21 Dec 2023