Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Frank Hutter
    OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown
Neural Networks for Predicting Permeability Tensors of 2D Porous Media: Comparison of Convolution- and Transformer-based Architectures
Sigurd Vargdal
Paula Reis
Henrik Andersen Sveinsson
Gaute Linga
MedIm
213
0
0
01 Dec 2025
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision
Chenting Wang
Yuhan Zhu
Yicheng Xu
Jiange Yang
Ziang Yan
Yali Wang
Yi Wang
Limin Wang
VGen
168
0
0
01 Dec 2025
DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants
Martin Andrae
Erik Larsson
So Takao
Tomas Landelius
Fredrik Lindsten
76
0
0
29 Nov 2025
Does Self-Evaluation Enable Wireheading in Language Models?
David Demitri Africa
Hans Ethan Ting
203
0
0
28 Nov 2025
Closed-Loop Transformers: Autoregressive Modeling as Iterative Latent Equilibrium
Akbar Anbar Jafari
G. Anbarjafari
68
1
0
26 Nov 2025
Deterministic Continuous Replacement: Fast and Stable Module Replacement in Pretrained Transformers
Rowan Bradbury
Aniket Srinivasan Ashok
Sai Ram Kasanagottu
Gunmay Jhingran
Shuai Meng
146
0
0
24 Nov 2025
Rethinking Vision Transformer Depth via Structural Reparameterization
Chengwei Zhou
Vipin Chaudhary
Gourav Datta
ViT
109
0
0
24 Nov 2025
Coherent Multi-Agent Trajectory Forecasting in Team Sports with CausalTraj
Wei Zhen Teoh
AI4TS
206
0
0
23 Nov 2025
RNN as Linear Transformer: A Closer Investigation into Representational Potentials of Visual Mamba Models
Timing Yang
Guoyizhe Wei
Alan Yuille
Feng Wang
Mamba
142
0
0
23 Nov 2025
Contrastive vision-language learning with paraphrasing and negation
K. Ngan
Saman Sadeghi Afgeh
Joe Townsend
Artur Garcez
VLM
177
0
0
20 Nov 2025
MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence
Liyuan Deng
Yunpeng Bai
Yongkang Dai
Xiaoshui Huang
Hongping Gan
Dongshuo Huang
Hao jiacheng
Yilei Shi
92
0
0
20 Nov 2025
Unsupervised Image Classification with Adaptive Nearest Neighbor Selection and Cluster Ensembles
Melih Baydar
Emre Akbas
182
0
0
20 Nov 2025
First Frame Is the Place to Go for Video Content Customization
Jingxi Chen
Z. Li
Zhichao Liu
Guangyao Shi
Xiyang Wu
Fuxiao Liu
Cornelia Fermüller
Brandon Yushan Feng
Yiannis Aloimonos
DiffM, VGen
202
0
0
19 Nov 2025
StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model
Y. Yang
Zhi Cen
Sida Peng
Xiangwei Chen
Yifu Deng
Xinyu Zhu
Fan Jia
Xiaowei Zhou
Hujun Bao
DiffM, VGen
324
0
0
18 Nov 2025
PerTouch: VLM-Driven Agent for Personalized and Semantic Image Retouching
Zewei Chang
Zheng-Peng Duan
Jianxing Zhang
Chun-Le Guo
Siyu Liu
Hyungju Chun
Hyunhee Park
Zikun Liu
Chongyi Li
DiffM
312
0
0
17 Nov 2025
Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining
Zhaocheng Yu
Kui Jiang
Junjun Jiang
Xianming Liu
Guanglu Sun
Yi Xiao
136
0
0
17 Nov 2025
AdamNX: An Adam improvement algorithm based on a novel exponential decay mechanism for the second-order moment estimate
Meng Zhu
Quan Xiao
Weidong Min
266
0
0
17 Nov 2025
D$^{2}$-VPR: A Parameter-efficient Visual-foundation-model-based Visual Place Recognition Method via Knowledge Distillation and Deformable Aggregation
Zheyuan Zhang
Jiwei Zhang
Boyu Zhou
Linzhimeng Duan
Hong Chen
156
1
0
16 Nov 2025
Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions
Emi Soroka
Tanmay Chopra
Krish Desai
Sanjay Lall
ALM
341
0
0
04 Nov 2025
A Generative Adversarial Approach to Adversarial Attacks Guided by Contrastive Language-Image Pre-trained Model
Sampriti Soor
Alik Pramanick
Jothiprakash K
Arijit Sur
AAML, GAN, VLM
644
0
0
03 Nov 2025
FedMuon: Accelerating Federated Learning with Matrix Orthogonalization
Junkang Liu
Fanhua Shang
Junchao Zhou
Hongying Liu
Yuanyuan Liu
Jin Liu
FedML
213
2
0
31 Oct 2025
FedAdamW: A Communication-Efficient Optimizer with Convergence and Generalization Guarantees for Federated Large Models
Junkang Liu
Fanhua Shang
Kewen Zhu
Hongying Liu
Yuanyuan Liu
Jin Liu
FedML
194
1
0
31 Oct 2025
Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism
Yuhua Jiang
Shuang Cheng
Yihao Liu
Ermo Hua
Che Jiang
Weigao Sun
Yu Cheng
Feifei Gao
Biqing Qi
Bowen Zhou
93
0
0
30 Oct 2025
DualCap: Enhancing Lightweight Image Captioning via Dual Retrieval with Similar Scenes Visual Prompts
Binbin Li
Guimiao Yang
Zisen Qi
Haiping Wang
Yu Ding
VLM
330
0
0
28 Oct 2025
Learning "Partner-Aware" Collaborators in Multi-Party Collaboration
Abhijnan Nath
Nikhil Krishnaswamy
123
0
0
26 Oct 2025
Model-Aware Tokenizer Transfer
Mykola Haltiuk
Aleksander Smywiński-Pohl
120
0
0
24 Oct 2025
Modest-Align: Data-Efficient Alignment for Vision-Language Models
Jiaxiang Liu
Yuan Wang
Jiawei Du
Joey Tianyi Zhou
Mingkun Xu
Zuozhu Liu
VLM
123
0
0
24 Oct 2025
What Does It Take to Build a Performant Selective Classifier?
Stephan Rabanser
Nicolas Papernot
214
0
0
23 Oct 2025
Deep Learning-Based Control Optimization for Glass Bottle Forming
Mattia Pujatti
Andrea Di Luca
Nicola Peghini
Federico Monegaglia
Marco Cristoforetti
AI4CE
52
0
0
21 Oct 2025
Trace Anything: Representing Any Video in 4D via Trajectory Fields
Xinhang Liu
Yuxi Xiao
Donny Y. Chen
Jiashi Feng
Yu-Wing Tai
Chi-Keung Tang
Bingyi Kang
136
4
0
15 Oct 2025
Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning
Guozhi Liu
Qi Mu
Tiansheng Huang
Xinhua Wang
Li Shen
Weiwei Lin
Zhang Li
124
1
0
11 Oct 2025
Probabilistic Hyper-Graphs using Multiple Randomly Masked Autoencoders for Semi-supervised Multi-modal Multi-task Learning
Pîrvu Mihai-Cristian
Leordeanu Marius
181
1
0
11 Oct 2025
Reconstructing the local density field with combined convolutional and point cloud architecture
Baptiste Barthe-Gold
Nhat-Minh Nguyen
Leander Thiele
3DPC
185
0
0
09 Oct 2025
Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training
Ruizhe Wang
Yucheng Ding
Xiao Liu
Yaoxiang Wang
Peng Cheng
Baining Guo
Zhengjun Zha
Yeyun Gong
145
0
0
09 Oct 2025
Reinforcement Learning-based Task Offloading in the Internet of Wearable Things
Waleed Bin Qaim
Aleksandr Ometov
Claudia Campolo
Antonella Molinaro
E. Lohan
J. Nurmi
OffRL
144
0
0
08 Oct 2025
Mid-Training of Large Language Models: A Survey
Kaixiang Mo
Yuxin Shi
Weiwei Weng
Zhiqiang Zhou
Shuman Liu
Haibo Zhang
Anxiang Zeng
LRM
151
0
0
08 Oct 2025
MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis
Qinghua Liu
Sam Heshmati
Zheda Mai
Zubin Abraham
John Paparrizos
Liu Ren
AI4TS
132
1
0
08 Oct 2025
Agent Fine-tuning through Distillation for Domain-specific LLMs in Microdomains
Yawen Xue
Masaya Tsunokake
Yuta Koreeda
Ekant Muljibhai Amin
Takashi Sumiyoshi
Yasuhiro Sogawa
LLMAG
124
0
0
01 Oct 2025
A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features
Axel Barroso-Laguna
Tommaso Cavallari
V. Prisacariu
Eric Brachmann
164
0
0
01 Oct 2025
Purrception: Variational Flow Matching for Vector-Quantized Image Generation
Răzvan-Andrei Matişan
Vincent Tao Hu
Grigory Bartosh
Bjorn Ommer
Cees G. M. Snoek
Max Welling
Jan-Willem van de Meent
Mohammad Mahdi Derakhshani
Floor Eijkelboom
140
1
0
01 Oct 2025
InfVSR: Breaking Length Limits of Generic Video Super-Resolution
Ziqing Zhang
Kai Liu
Zheng Chen
X. Li
Yihao Chen
Bingnan Duan
Linghe Kong
Yulun Zhang
159
2
0
01 Oct 2025
Erased, But Not Forgotten: Erased Rectified Flow Transformers Still Remain Unsafe Under Concept Attack
Nanxiang Jiang
Zhaoxin Fan
Enhan Kang
Daiheng Gao
Yun Zhou
Yanxia Chang
Zheng Zhu
Yeying Jin
Wenjun Wu
AAML
184
0
0
01 Oct 2025
Asymmetric VAE for One-Step Video Super-Resolution Acceleration
Jianze Li
Yong Guo
Yulun Zhang
Xiaokang Yang
DiffM
111
0
0
29 Sep 2025
Effective Quantization of Muon Optimizer States
Aman Gupta
Rafael Celente
Abhishek Shivanna
D. T. Braithwaite
Gregory Dexter
Shao Tang
Hiroto Udagawa
Daniel Silva
R. Ramanath
S. Keerthi
MQ
139
0
0
27 Sep 2025
You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors
Bochuan Cao
Changjiang Li
Yuanpu Cao
Yameng Ge
Ting Wang
Jinghui Chen
AAML
113
3
0
26 Sep 2025
One Filters All: A Generalist Filter for State Estimation
Shiqi Liu
Wenhan Cao
Chang Liu
Zeyu He
Tianyi Zhang
Jingliang Duan
OffRL
172
2
0
24 Sep 2025
VGGT-DP: Generalizable Robot Control via Vision Foundation Models
Shijia Ge
Yinxin Zhang
Shuzhao Xie
Weixiang Zhang
Mingcai Zhou
Zhi Wang
85
0
0
23 Sep 2025
MirrorSAM2: Segment Mirror in Videos with Depth Perception
Mingchen Xu
Yukun Lai
Ze Ji
Jing Wu
VLM, MDE
136
0
0
21 Sep 2025
Advancing Speech Understanding in Speech-Aware Language Models with GRPO
Avishai Elmakies
Hagai Aronowitz
Nimrod Shabtay
Eli Schwartz
R. Hoory
Avihu Dekel
116
1
0
21 Sep 2025
Unlocking Hidden Potential in Point Cloud Networks with Attention-Guided Grouping-Feature Coordination
Shangzhuo Xie
Qianqian Yang
3DPC
132
0
0
20 Sep 2025