ResearchTrend.AI
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

25 February 2016
Tim Salimans
Diederik P. Kingma
    ODL

Papers citing "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks"

50 / 957 papers shown
Mask-PINNs: Regulating Feature Distributions in Physics-Informed Neural Networks
Feilong Jiang
Xiaonan Hou
Jianqiao Ye
Min Xia
OOD
PINN
42
0
0
09 May 2025
On the Importance of Gaussianizing Representations
Daniel Eftekhari
Vardan Papyan
26
0
0
01 May 2025
MERA: Multimodal and Multiscale Self-Explanatory Model with Considerably Reduced Annotation for Lung Nodule Diagnosis
Jiahao Lu
Chong Yin
Silvia Ingala
Kenny Erleben
M. Nielsen
S. Darkner
51
0
0
27 Apr 2025
Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching
Junn Yong Loo
Michelle Adeline
Julia Kaiwen Lau
Fang Yu Leong
Hwa Hui Tew
Arghya Pal
Vishnu Monn Baskaran
Chee-Ming Ting
Raphaël C.-W. Phan
BDL
66
0
0
22 Apr 2025
SparsyFed: Sparse Adaptive Federated Training
Adriano Guastella
Lorenzo Sani
Alex Iacob
Alessio Mora
Paolo Bellavista
Nicholas D. Lane
FedML
31
0
0
07 Apr 2025
FIESTA: Fisher Information-based Efficient Selective Test-time Adaptation
Mohammadmahdi Honarmand
O. Mutlu
Parnian Azizian
Saimourya Surabhi
Dennis Paul Wall
TTA
75
0
0
29 Mar 2025
Beyond RGB: Adaptive Parallel Processing for RAW Object Detection
Shani Gamrian
Hila Barel
Feiran Li
Masakazu Yoshimura
Daisuke Iso
51
0
0
17 Mar 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
ViT
OffRL
56
7
0
13 Mar 2025
Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion
Mona Sheikh Zeinoddin
Mobarakol Islam
Zafer Tandogdu
Greg Shaw
Mathew J. Clarkson
E. Mazomenos
Danail Stoyanov
130
0
0
10 Mar 2025
Same accuracy, twice as fast: continuous training surpasses retraining from scratch
Eli Verwimp
Guy Hacohen
Tinne Tuytelaars
OnRL
39
0
0
28 Feb 2025
Self-Adjust Softmax
Chuanyang Zheng
Yihang Gao
Guoxuan Chen
Han Shi
Jing Xiong
Xiaozhe Ren
Chao Huang
Xin Jiang
Z. Li
Yu-Hu Li
38
0
0
25 Feb 2025
SECURA: Sigmoid-Enhanced CUR Decomposition with Uninterrupted Retention and Low-Rank Adaptation in Large Language Models
Yuxuan Zhang
CLL
ALM
65
1
0
25 Feb 2025
Hyperspherical Normalization for Scalable Deep Reinforcement Learning
Hojoon Lee
Youngdo Lee
Takuma Seno
Donghu Kim
Peter Stone
Jaegul Choo
63
1
0
24 Feb 2025
Simplifying DINO via Coding Rate Regularization
Ziyang Wu
Jingyuan Zhang
Druv Pai
X. Wang
Chandan Singh
Jianwei Yang
Jianfeng Gao
Yi-An Ma
159
1
0
17 Feb 2025
Optimal Subspace Inference for the Laplace Approximation of Bayesian Neural Networks
Josua Faller
Jörg Martin
BDL
73
0
0
04 Feb 2025
Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Sifan Wang
Ananyae Kumar Bhartari
Bowen Li
P. Perdikaris
PINN
56
4
0
02 Feb 2025
EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition
Hamid Nasiri
Peter Garraghan
41
1
0
21 Jan 2025
Normalizing Batch Normalization for Long-Tailed Recognition
Yuxiang Bao
Guoliang Kang
Linlin Yang
Xiaoyue Duan
Bo Zhao
Baochang Zhang
MQ
47
0
0
06 Jan 2025
Optimization Insights into Deep Diagonal Linear Networks
Hippolyte Labarrière
C. Molinari
Lorenzo Rosasco
S. Villa
Cristian Vega
76
0
0
21 Dec 2024
RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks
Nazia Tasnim
Bryan A. Plummer
CLL
OffRL
74
0
0
25 Nov 2024
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Jim Zhao
Sidak Pal Singh
Aurélien Lucchi
AI4CE
39
0
0
04 Nov 2024
Improving self-training under distribution shifts via anchored confidence with theoretical guarantees
Taejong Joo
Diego Klabjan
UQCV
49
0
0
01 Nov 2024
TrAct: Making First-layer Pre-Activations Trainable
Felix Petersen
Christian Borgelt
Stefano Ermon
21
0
0
31 Oct 2024
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
Atli Kosson
Bettina Messmer
Martin Jaggi
AI4CE
18
2
0
31 Oct 2024
Multilingual Vision-Language Pre-training for the Remote Sensing Domain
João Daniel Silva
João Magalhães
D. Tuia
Bruno Martins
CLIP
VLM
42
1
0
30 Oct 2024
Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization
Juyoung Yun
21
2
0
28 Oct 2024
YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
Hao-Tang Tsui
Chien-Yao Wang
H. Liao
ObjD
VLM
51
0
0
20 Oct 2024
On the Regularization of Learnable Embeddings for Time Series Forecasting
L. Butera
G. Felice
Andrea Cini
C. Alippi
AI4TS
29
0
0
18 Oct 2024
AERO: Softmax-Only LLMs for Efficient Private Inference
N. Jha
Brandon Reagen
27
1
0
16 Oct 2024
RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework
Junxiao Shen
Roger Boldu
Arpit Kalla
Michael Glueck
Hemant Bhaskar Surale
Amy Karlson
21
3
0
08 Oct 2024
FINALLY: fast and universal speech enhancement with studio-like quality
Nicholas Babaev
Kirill Tamogashev
Azat Saginbaev
Ivan Shchekotov
Hanbin Bae
Hosang Sung
WonJun Lee
Hoon-Young Cho
Pavel Andreev
29
2
0
08 Oct 2024
Variable Bitrate Residual Vector Quantization for Audio Coding
Yunkee Chae
Woosung Choi
Yuhta Takida
Junghyun Koo
Yukara Ikemiya
...
K. Cheuk
Marco A. Martínez Ramírez
Kyogu Lee
Wei-Hsiang Liao
Yuki Mitsufuji
74
0
0
08 Oct 2024
Initialization of Large Language Models via Reparameterization to Mitigate Loss Spikes
Kosuke Nishida
Kyosuke Nishida
Kuniko Saito
28
1
0
07 Oct 2024
Addressing Data Heterogeneity in Federated Learning with Adaptive Normalization-Free Feature Recalibration
Vasilis Siomos
Sergio Naval Marimont
Jonathan Passerat-Palmbach
G. Tarroni
28
0
0
02 Oct 2024
Margin-bounded Confidence Scores for Out-of-Distribution Detection
L. Tamang
Mohamed Reda Bouadjenek
Richard Dazeley
Sunil Aryal
OODD
41
0
0
22 Sep 2024
ASPINN: An asymptotic strategy for solving singularly perturbed differential equations
Sen Wang
Peizhi Zhao
Tao Song
25
0
0
20 Sep 2024
A Comprehensive Survey on Evidential Deep Learning and Its Applications
Junyu Gao
Mengyuan Chen
Liangyu Xiang
Changsheng Xu
EDL
BDL
UQCV
42
5
0
07 Sep 2024
Weight Conditioning for Smooth Optimization of Neural Networks
Hemanth Saratchandran
Thomas X. Wang
Simon Lucey
38
0
0
05 Sep 2024
FastVoiceGrad: One-step Diffusion-Based Voice Conversion with Adversarial Conditional Diffusion Distillation
Takuhiro Kaneko
Hirokazu Kameoka
Kou Tanaka
Yuto Kondo
DiffM
40
0
0
03 Sep 2024
Pureformer-VC: Non-parallel One-Shot Voice Conversion with Pure Transformer Blocks and Triplet Discriminative Training
Wenhan Yao
Zedong Xing
Xiarun Chen
Jia Liu
Yongqiang He
Weiping Wen
21
0
0
03 Sep 2024
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music
N. Shikarpur
Krishna Maneesha Dendukuri
Yusong Wu
Antoine Caillon
Cheng-Zhi Anna Huang
20
1
0
22 Aug 2024
Oja's plasticity rule overcomes several challenges of training neural networks under biological constraints
Navid Shervani-Tabar
Marzieh Alireza Mirhoseini
Robert Rosenbaum
AAML
AI4CE
37
0
0
15 Aug 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
Le Yu
Bowen Yu
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
27
5
0
06 Aug 2024
ZNorm: Z-Score Gradient Normalization for Deep Neural Networks
Giorgia Adorni
Alberto Piatti
18
0
0
02 Aug 2024
Towards the Spectral bias Alleviation by Normalizations in Coordinate Networks
Zhicheng Cai
Hao Zhu
Qiu Shen
Xinran Wang
Xun Cao
35
0
0
25 Jul 2024
Variational Potential Flow: A Novel Probabilistic Framework for Energy-Based Generative Modelling
Junn Yong Loo
Michelle Adeline
Arghya Pal
Vishnu Monn Baskaran
Chee-Ming Ting
Raphaël C.-W. Phan
DiffM
38
0
0
21 Jul 2024
Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning
Augustine N. Mavor-Parker
Matthew J. Sargent
Caswell Barry
Lewis D. Griffin
Clare Lyle
37
2
0
09 Jul 2024
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention
Yunzhong Si
Huiying Xu
Xinzhong Zhu
Wenhao Zhang
Yao Dong
Yuxing Chen
Hongbo Li
45
19
0
06 Jul 2024
Normalization and effective learning rates in reinforcement learning
Clare Lyle
Zeyu Zheng
Khimya Khetarpal
James Martens
H. V. Hasselt
Razvan Pascanu
Will Dabney
19
7
0
01 Jul 2024
Injectivity of ReLU-layers: Perspectives from Frame Theory
Péter Balázs
Martin Ehler
Daniel Haider
FAtt
20
0
0
22 Jun 2024