Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

25 February 2016

Papers citing "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks"

50 / 957 papers shown

Mask-PINNs: Regulating Feature Distributions in Physics-Informed Neural Networks Feilong Jiang Xiaonan Hou Jianqiao Ye Min Xia OOD PINN 42 0 0 09 May 2025
On the Importance of Gaussianizing Representations Daniel Eftekhari Vardan Papyan 26 0 0 01 May 2025
MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis Jiahao Lu Chong Yin Silvia Ingala Kenny Erleben M. Nielsen S. Darkner 49 0 0 27 Apr 2025
Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching Junn Yong Loo Michelle Adeline Julia Kaiwen Lau Fang Yu Leong Hwa Hui Tew Arghya Pal Vishnu Monn Baskaran Chee-Ming Ting Raphaël C.-W. Phan BDL 61 0 0 22 Apr 2025
SparsyFed: Sparse Adaptive Federated Training Adriano Guastella Lorenzo Sani Alex Iacob Alessio Mora Paolo Bellavista Nicholas D. Lane FedML 31 0 0 07 Apr 2025
FIESTA: Fisher Information-based Efficient Selective Test-time Adaptation Mohammadmahdi Honarmand O. Mutlu Parnian Azizian Saimourya Surabhi Dennis Paul Wall TTA 75 0 0 29 Mar 2025
Beyond RGB: Adaptive Parallel Processing for RAW Object Detection Shani Gamrian Hila Barel Feiran Li Masakazu Yoshimura Daisuke Iso 51 0 0 17 Mar 2025
Transformers without Normalization Jiachen Zhu Xinlei Chen Kaiming He Yann LeCun Zhuang Liu ViT OffRL 53 7 0 13 Mar 2025
Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion Mona Sheikh Zeinoddin Mobarakol Islam Zafer Tandogdu Greg Shaw Mathew J. Clarkson E. Mazomenos Danail Stoyanov 127 0 0 10 Mar 2025
Same accuracy, twice as fast: continuous training surpasses retraining from scratch Eli Verwimp Guy Hacohen Tinne Tuytelaars OnRL 39 0 0 28 Feb 2025
Self-Adjust Softmax Chuanyang Zheng Yihang Gao Guoxuan Chen Han Shi Jing Xiong Xiaozhe Ren Chao Huang Xin Jiang Z. Li Yu-Hu Li 38 0 0 25 Feb 2025
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models Yuxuan Zhang CLL ALM 63 1 0 25 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning Hojoon Lee Youngdo Lee Takuma Seno Donghu Kim Peter Stone Jaegul Choo 63 1 0 24 Feb 2025
Simplifying DINO via Coding Rate Regularization Ziyang Wu Jingyuan Zhang Druv Pai X. Wang Chandan Singh Jianwei Yang Jianfeng Gao Yi-An Ma 156 1 0 17 Feb 2025
Optimal Subspace Inference for the Laplace Approximation of Bayesian Neural Networks Josua Faller Jörg Martin BDL 73 0 0 04 Feb 2025
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective Sifan Wang Ananyae Kumar Bhartari Bowen Li P. Perdikaris PINN 54 4 0 02 Feb 2025
EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition Hamid Nasiri Peter Garraghan 39 1 0 21 Jan 2025
Normalizing Batch Normalization for Long-Tailed Recognition Yuxiang Bao Guoliang Kang Linlin Yang Xiaoyue Duan Bo Zhao Baochang Zhang MQ 47 0 0 06 Jan 2025
Optimization Insights into Deep Diagonal Linear Networks Hippolyte Labarrière C. Molinari Lorenzo Rosasco S. Villa Cristian Vega 76 0 0 21 Dec 2024
RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks Nazia Tasnim Bryan A. Plummer CLL OffRL 74 0 0 25 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks Jim Zhao Sidak Pal Singh Aurélien Lucchi AI4CE 39 0 0 04 Nov 2024
Improving self-training under distribution shifts via anchored confidence with theoretical guarantees Taejong Joo Diego Klabjan UQCV 49 0 0 01 Nov 2024
TrAct: Making First-layer Pre-Activations Trainable Felix Petersen Christian Borgelt Stefano Ermon 19 0 0 31 Oct 2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training Atli Kosson Bettina Messmer Martin Jaggi AI4CE 18 2 0 31 Oct 2024
Multilingual Vision-Language Pre-training for the Remote Sensing Domain João Daniel Silva João Magalhães D. Tuia Bruno Martins CLIP VLM 37 1 0 30 Oct 2024
Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization Juyoung Yun 19 2 0 28 Oct 2024
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary Hao-Tang Tsui Chien-Yao Wang H. Liao ObjD VLM 48 0 0 20 Oct 2024
On the Regularization of Learnable Embeddings for Time Series Forecasting L. Butera G. Felice Andrea Cini C. Alippi AI4TS 29 0 0 18 Oct 2024
AERO: Softmax-Only LLMs for Efficient Private Inference N. Jha Brandon Reagen 27 1 0 16 Oct 2024
RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework Junxiao Shen Roger Boldu Arpit Kalla Michael Glueck Hemant Bhaskar Surale Amy Karlson 19 3 0 08 Oct 2024
FINALLY: fast and universal speech enhancement with studio-like quality Nicholas Babaev Kirill Tamogashev Azat Saginbaev Ivan Shchekotov Hanbin Bae Hosang Sung WonJun Lee Hoon-Young Cho Pavel Andreev 29 2 0 08 Oct 2024
Variable Bitrate Residual Vector Quantization for Audio Coding Yunkee Chae Woosung Choi Yuhta Takida Junghyun Koo Yukara Ikemiya ... K. Cheuk Marco A. Martínez Ramírez Kyogu Lee Wei-Hsiang Liao Yuki Mitsufuji 74 0 0 08 Oct 2024
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes Kosuke Nishida Kyosuke Nishida Kuniko Saito 28 1 0 07 Oct 2024
Addressing Data Heterogeneity in Federated Learning with Adaptive Normalization-Free Feature Recalibration Vasilis Siomos Sergio Naval Marimont Jonathan Passerat-Palmbach G. Tarroni 28 0 0 02 Oct 2024
Margin-bounded Confidence Scores for Out-of-Distribution Detection L. Tamang Mohamed Reda Bouadjenek Richard Dazeley Sunil Aryal OODD 41 0 0 22 Sep 2024
ASPINN: An asymptotic strategy for solving singularly perturbed differential equations Sen Wang Peizhi Zhao Tao Song 25 0 0 20 Sep 2024
A Comprehensive Survey on Evidential Deep Learning and Its Applications Junyu Gao Mengyuan Chen Liangyu Xiang Changsheng Xu EDL BDL UQCV 42 5 0 07 Sep 2024
Weight Conditioning for Smooth Optimization of Neural Networks Hemanth Saratchandran Thomas X. Wang Simon Lucey 38 0 0 05 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation Takuhiro Kaneko Hirokazu Kameoka Kou Tanaka Yuto Kondo DiffM 40 0 0 03 Sep 2024
Pureformer-VC: Non-parallel One-Shot Voice Conversion with Pure Transformer Blocks and Triplet Discriminative Training Wenhan Yao Zedong Xing Xiarun Chen Jia Liu Yongqiang He Weiping Wen 21 0 0 03 Sep 2024
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music N. Shikarpur Krishna Maneesha Dendukuri Yusong Wu Antoine Caillon Cheng-Zhi Anna Huang 15 1 0 22 Aug 2024
Oja's plasticity rule overcomes several challenges of training neural networks under biological constraints Navid Shervani-Tabar Marzieh Alireza Mirhoseini Robert Rosenbaum AAML AI4CE 37 0 0 15 Aug 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement Le Yu Bowen Yu Haiyang Yu Fei Huang Yongbin Li MoMe 27 5 0 06 Aug 2024
ZNorm: Z-Score Gradient Normalization for Deep Neural Networks Giorgia Adorni Alberto Piatti 18 1 0 02 Aug 2024
Towards the Spectral bias Alleviation by Normalizations in Coordinate Networks Zhicheng Cai Hao Zhu Qiu Shen Xinran Wang Xun Cao 35 0 0 25 Jul 2024
Variational Potential Flow: A Novel Probabilistic Framework for Energy-Based Generative Modelling Junn Yong Loo Michelle Adeline Arghya Pal Vishnu Monn Baskaran Chee-Ming Ting Raphaël C.-W. Phan DiffM 36 0 0 21 Jul 2024
Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning Augustine N. Mavor-Parker Matthew J. Sargent Caswell Barry Lewis D. Griffin Clare Lyle 37 2 0 09 Jul 2024
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention Yunzhong Si Huiying Xu Xinzhong Zhu Wenhao Zhang Yao Dong Yuxing Chen Hongbo Li 45 19 0 06 Jul 2024
Normalization and effective learning rates in reinforcement learning Clare Lyle Zeyu Zheng Khimya Khetarpal James Martens H. V. Hasselt Razvan Pascanu Will Dabney 19 7 0 01 Jul 2024
Injectivity of ReLU-layers: Perspectives from Frame Theory Péter Balázs Martin Ehler Daniel Haider FAtt 20 0 0 22 Jun 2024