Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Frank Hutter
    OffRL
ArXiv (abs) | PDF | HTML | GitHub (275★)

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown
MoAngelo: Motion-Aware Neural Surface Reconstruction for Dynamic Scenes
Mohamed Ebbed
Zorah Lähner
3DH
99
0
0
19 Sep 2025
Positional Encoding via Token-Aware Phase Attention
Wang
Sheng Shen
Rémi Munos
Hongyuan Zhan
Yuandong Tian
189
0
0
16 Sep 2025
Towards Foundational Models for Single-Chip Radar
Tianshu Huang
Akarsh Prabhakara
Chuhan Chen
Jay Karhade
Deva Ramanan
Matthew O'Toole
Anthony G. Rowe
184
1
0
15 Sep 2025
Weakly Supervised Vulnerability Localization via Multiple Instance Learning
ACM Transactions on Software Engineering and Methodology (TOSEM), 2025
Wenchao Gu
Yupan Chen
Yanlin Wang
Hongyu Zhang
Cuiyun Gao
Michael R. Lyu
AAML
137
0
0
14 Sep 2025
Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
Dubing Chen
Huan Zheng
Yucheng Zhou
Xianfei Li
Wenlong Liao
Tao He
Pai Peng
Jianbing Shen
3DPC
122
3
0
10 Sep 2025
Collaborate, Deliberate, Evaluate: How LLM Alignment Affects Coordinated Multi-Agent Outcomes
Abhijnan Nath
Carine Graff
Nikhil Krishnaswamy
LLMAG
159
3
0
07 Sep 2025
Empowering Large Language Model for Sequential Recommendation via Multimodal Embeddings and Semantic IDs
Yuhao Wang
Junwei Pan
Xinhang Li
Xinjian Zhao
Y Samuel Wang
Yue Liu
Dapeng Liu
Jie Jiang
Xiangyu Zhao
159
2
0
02 Sep 2025
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
Georgios Papoudakis
Thomas Coste
Jianye Hao
Jun Wang
Cheng Deng
OffRL
276
0
0
01 Sep 2025
Mamba-CNN: A Hybrid Architecture for Efficient and Accurate Facial Beauty Prediction
Djamel Eddine Boukhari
CVBM Mamba 3DH
141
4
0
01 Sep 2025
Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection
Ye Tao
Xinran Fu
Honglin Pang
Xi Yang
Chuntao Li
81
3
0
26 Aug 2025
RoofSeg: An edge-aware transformer-based network for end-to-end roof plane segmentation
Siyuan You
Guozheng Xu
Pengwei Zhou
Qiwen Jin
Jian Yao
Li Li
ViT
120
0
0
26 Aug 2025
UniSino: Physics-Driven Foundational Model for Universal CT Sinogram Standardization
Xingyu Ai
Shaoyu Wang
Zhiyuan Jia
Ao Xu
Hongming Shan
Jianhua Ma
Qiegen Liu
97
0
0
25 Aug 2025
DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method
Qingwen Zhang
Xiaomeng Zhu
Yushan Zhang
Yixi Cai
Olov Andersson
Patric Jensfelt
258
0
0
23 Aug 2025
CLAIRE-DSA: Fluoroscopic Image Classification for Quality Assurance of Computer Vision Pipelines in Acute Ischemic Stroke
Cristo J. van den Berg
Frank G. te Nijenhuis
Mirre J. Blaauboer
Daan T. W. van Erp
Carlijn M. Keppels
...
W. V. van Zwam
Sandra A. P. Cornelissen
D. Ruijters
Ruisheng Su
T. van Walsum
49
0
0
18 Aug 2025
OVG-HQ: Online Video Grounding with Hybrid-modal Queries
Runhao Zeng
Jiaqi Mao
Minghao Lai
Minh Hieu Phan
Yanjie Dong
Wei Wang
Qi Chen
Xiping Hu
161
0
0
16 Aug 2025
CPO: Addressing Reward Ambiguity in Role-playing Dialogue via Comparative Policy Optimization
Xinge Ye
Rui Wang
Yuchuan Wu
Victor Ma
Feiteng Fang
Fei Huang
Y. Li
OffRL
147
4
0
12 Aug 2025
FetFIDS: A Feature Embedding Attention based Federated Network Intrusion Detection Algorithm
Shreya Ghosh
Abu Shafin Mohammad Mahdee Jameel
Aly El Gamal
FedML
28
0
0
12 Aug 2025
Gradient Surgery for Safe LLM Fine-Tuning
Biao Yi
Jiahao Li
Baolei Zhang
Lihai Nie
Tong Li
Tiansheng Huang
Zheli Liu
122
2
0
10 Aug 2025
MobileViCLIP: An Efficient Video-Text Model for Mobile Devices
Min Yang
Zihan Jia
Zhilin Dai
Sheng Guo
Limin Wang
CLIP VLM
195
0
0
10 Aug 2025
AdvDINO: Domain-Adversarial Self-Supervised Representation Learning for Spatial Proteomics
Stella Su
Marc Harary
Scott J. Rodig
William Lotter
64
2
0
07 Aug 2025
Learning to See and Act: Task-Aware Virtual View Exploration for Robotic Manipulation
Yongjie Bai
Zhouxia Wang
Wenshu Fan
Weixing Chen
Ziliang Chen
...
Yongsen Zheng
Lingbo Liu
Guanbin Li
Liang Lin
295
1
0
07 Aug 2025
Bidding-Aware Retrieval for Multi-Stage Consistency in Online Advertising
Yinan Han
Y. Liu
Ziru Xu
Zhaoyu Zhou
Zhi Kou
Yeqiu Yang
Han Zhu
Jian Xu
Bo Zheng
81
0
0
07 Aug 2025
Audio Does Matter: Importance-Aware Multi-Granularity Fusion for Video Moment Retrieval
Junan Lin
Daizong Liu
Xianke Chen
Xiaoye Qu
Xun Yang
Jixiang Zhu
Sanyuan Zhang
Jianfeng Dong
302
0
0
06 Aug 2025
Speech-to-LaTeX: New Models and Datasets for Converting Spoken Equations and Sentences
Dmitrii Korzh
Dmitrii Tarasov
Artyom Iudin
Elvir Karimov
Matvey Skripkin
Nikita Kuzmin
Andrey Kuznetsov
Oleg Y. Rogov
Ivan Oseledets
171
0
0
05 Aug 2025
Injecting Measurement Information Yields a Fast and Noise-Robust Diffusion-Based Inverse Problem Solver
J. Patsenker
Henry Li
Myeongseob Ko
Ruoxi Jia
Y. Kluger
DiffM
331
0
0
05 Aug 2025
Trainable Dynamic Mask Sparse Attention
Jingze Shi
Yifan Wu
Yiran Peng
Liangdong Wang
Guang Liu
Yuyu Luo
351
3
0
04 Aug 2025
The Art of Breaking Words: Rethinking Multilingual Tokenizer Design
Aamod Thakur
Ajay Nagpal
Atharva Savarkar
Kundeshwar Pundalik
Siddhesh Dosi
Piyush Sawarkar
Viraj Thakur
Rohit Saluja
Maunendra Sankar Desarkar
Ganesh Ramakrishnan
104
2
0
03 Aug 2025
InspectVLM: Unified in Theory, Unreliable in Practice
Conor Wallace
Isaac Corley
Jonathan Lwowski
MLLM VLM
111
0
0
03 Aug 2025
Versatile Transition Generation with Image-to-Video Diffusion
Zuhao Yang
Jiahui Zhang
Yingchen Yu
Shijian Lu
Song Bai
DiffM VGen
233
3
0
03 Aug 2025
Hyperbolic Cycle Alignment for Infrared-Visible Image Fusion
Timing Li
Bing Cao
Jiahe Feng
Haifang Cao
Q. Hu
135
0
0
31 Jul 2025
SCANet: Split Coordinate Attention Network for Building Footprint Extraction
International Conference on Neural Information Processing (ICONIP), 2025
Chunshi Wang
Bin Zhao
Shuxue Ding
ViT
132
0
0
28 Jul 2025
Regularizing Subspace Redundancy of Low-Rank Adaptation
Yue Zhu
Haiwen Diao
Shang Gao
Jiazuo Yu
Jiawen Zhu
...
Shuai Hao
Xu Jia
Lu Zhang
Y. Zhang
Huchuan Lu
200
0
0
28 Jul 2025
MambaMap: Online Vectorized HD Map Construction using State Space Model
Ruizi Yang
Xiaolu Liu
Junbo Chen
Jianke Zhu
Mamba
165
0
0
27 Jul 2025
FinDPO: Financial Sentiment Analysis for Algorithmic Trading through Preference Optimization of LLMs
Giorgos Iacovides
Wuyang Zhou
Danilo Mandic
150
4
0
24 Jul 2025
DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD
Xianbiao Qi
Marco Chen
Wenjie Xiao
Jiaquan Ye
Yelin He
Chun-Guang Li
Zhouchen Lin
OffRL
140
0
0
23 Jul 2025
FW-VTON: Flattening-and-Warping for Person-to-Person Virtual Try-on
Zheng Wang
Xianbing Sun
Shengyi Wu
Jiahui Zhan
Jianlou Si
Chi Zhang
Liqing Zhang
Jianfu Zhang
155
0
0
21 Jul 2025
TriCLIP-3D: A Unified Parameter-Efficient Framework for Tri-Modal 3D Visual Grounding based on CLIP
Fan Li
Zanyi Wang
Zeyi Huang
Guang Dai
Jingdong Wang
Mengmeng Wang
215
0
0
20 Jul 2025
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)
Chongli Qin
Jost Tobias Springenberg
OffRL
209
12
0
17 Jul 2025
Hashed Watermark as a Filter: Defeating Forging and Overwriting Attacks in Weight-based Neural Network Watermarking
Yuan Yao
Jin Song
Jian Jin
AAML
203
1
0
15 Jul 2025
Pre-Training LLMs on a budget: A comparison of three optimizers
Joel Schlotthauer
Christian Kroos
Chris Hinze
Viktor Hangya
Luzian Hahn
Fabian Küch
197
0
0
11 Jul 2025
Tractable Representation Learning with Probabilistic Circuits
Steven Braun
Sahil Sidheekh
Antonio Vergari
Martin Mundt
S. Natarajan
Kristian Kersting
TPM
378
0
0
06 Jul 2025
Don't Trust Generative Agents to Mimic Communication on Social Networks Unless You Benchmarked their Empirical Realism
Simon Münker
Nils Schwager
Achim Rettinger
215
0
0
27 Jun 2025
DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic
Munish Monga
Vishal M. Chudasama
Pankaj Wasnik
Biplab Banerjee
532
0
0
26 Jun 2025
Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity
Samin Yeasar Arnob
Scott Fujimoto
Doina Precup
OffRL
227
0
0
20 Jun 2025
Enhanced Dermatology Image Quality Assessment via Cross-Domain Training
Ignacio Hernández Montilla
Alfonso Medela
Paola Pasquali
Andy Aguilar
Taig Mac Carthy
Gerardo Fernández
Antonio Martorell
Enrique Onieva
82
0
0
19 Jun 2025
Sysformer: Safeguarding Frozen Large Language Models with Adaptive System Prompts
Kartik Sharma
Yiqiao Jin
Vineeth Rakesh
Yingtong Dou
Menghai Pan
Mahashweta Das
Srijan Kumar
AAML
224
0
0
18 Jun 2025
The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions
Devin Kwok
Gül Sena Altıntaş
Colin Raffel
David Rolnick
415
2
0
16 Jun 2025
GFRIEND: Generative Few-shot Reward Inference through EfficieNt DPO
Yiyang Zhao
Huiyu Bai
Xuejiao Zhao
OffRL
181
0
0
10 Jun 2025
G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration
Samuel Holt
Max Ruiz Luyten
Antonin Berthon
M. Schaar
199
3
0
10 Jun 2025
Flow Diverse and Efficient: Learning Momentum Flow Matching via Stochastic Velocity Field Sampling
Zhiyuan Ma
Ruixun Liu
Sixian Liu
Jianjun Li
Bowen Zhou
235
2
0
10 Jun 2025