Decoupled Weight Decay Regularization

14 November 2017

Katharina Eggensperger

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown

Neural Networks for Predicting Permeability Tensors of 2D Porous Media: Comparison of Convolution- and Transformer-based Architectures

Henrik Andersen Sveinsson

213

0

0

01 Dec 2025

InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision

168

0

0

01 Dec 2025

DAISI: Data Assimilation with Inverse Sampling using Stochastic Interpolants

Tomas Landelius

Fredrik Lindsten

76

0

0

29 Nov 2025

Does Self-Evaluation Enable Wireheading in Language Models?

David Demitri Africa

Hans Ethan Ting

203

0

0

28 Nov 2025

Closed-Loop Transformers: Autoregressive Modeling as Iterative Latent Equilibrium

Akbar Anbar Jafari

68

1

0

26 Nov 2025

Deterministic Continuous Replacement: Fast and Stable Module Replacement in Pretrained Transformers

Aniket Srinivasan Ashok

Sai Ram Kasanagottu

Gunmay Jhingran

146

0

0

24 Nov 2025

Rethinking Vision Transformer Depth via Structural Reparameterization

Vipin Chaudhary

109

0

0

24 Nov 2025

Coherent Multi-Agent Trajectory Forecasting in Team Sports with CausalTraj

206

0

0

23 Nov 2025

RNN as Linear Transformer: A Closer Investigation into Representational Potentials of Visual Mamba Models

142

0

0

23 Nov 2025

Contrastive vision-language learning with paraphrasing and negation

Saman Sadeghi Afgeh

177

0

0

20 Nov 2025

MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

92

0

0

20 Nov 2025

Unsupervised Image Classification with Adaptive Nearest Neighbor Selection and Cluster Ensembles

182

0

0

20 Nov 2025

First Frame Is the Place to Go for Video Content Customization

Cornelia Fermüller

Brandon Yushan Feng

Yiannis Aloimonos

202

0

0

19 Nov 2025

StreamingTalker: Audio-driven 3D Facial Animation with Autoregressive Diffusion Model

324

0

0

18 Nov 2025

PerTouch: VLM-Driven Agent for Personalized and Semantic Image Retouching

Zheng-Peng Duan

312

0

0

17 Nov 2025

Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining

136

0

0

17 Nov 2025

AdamNX: An Adam improvement algorithm based on a novel exponential decay mechanism for the second-order moment estimate

266

0

0

17 Nov 2025

$D^{2}$-VPR: A Parameter-efficient Visual-foundation-model-based Visual Place Recognition Method via Knowledge Distillation and Deformable Aggregation

Linzhimeng Duan

156

1

0

16 Nov 2025

Unsupervised Evaluation of Multi-Turn Objective-Driven Interactions

341

0

0

04 Nov 2025

A Generative Adversarial Approach to Adversarial Attacks Guided by Contrastive Language-Image Pre-trained Model

644

0

0

03 Nov 2025

FedMuon: Accelerating Federated Learning with Matrix Orthogonalization

213

2

0

31 Oct 2025

FedAdamW: A Communication-Efficient Optimizer with Convergence and Generalization Guarantees for Federated Large Models

194

1

0

31 Oct 2025

Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism

93

0

0

30 Oct 2025

DualCap: Enhancing Lightweight Image Captioning via Dual Retrieval with Similar Scenes Visual Prompts

330

0

0

28 Oct 2025

Learning "Partner-Aware" Collaborators in Multi-Party Collaboration

Nikhil Krishnaswamy

123

0

0

26 Oct 2025

Model-Aware Tokenizer Transfer

Aleksander Smywiński-Pohl

120

0

0

24 Oct 2025

Modest-Align: Data-Efficient Alignment for Vision-Language Models

Joey Tianyi Zhou

123

0

0

24 Oct 2025

What Does It Take to Build a Performant Selective Classifier?

Stephan Rabanser

Nicolas Papernot

214

0

0

23 Oct 2025

Deep Learning-Based Control Optimization for Glass Bottle Forming

Federico Monegaglia

Marco Cristoforetti

52

0

0

21 Oct 2025

Trace Anything: Representing Any Video in 4D via Trajectory Fields

136

4

0

15 Oct 2025

Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning

Tiansheng Huang

124

1

0

11 Oct 2025

Probabilistic Hyper-Graphs using Multiple Randomly Masked Autoencoders for Semi-supervised Multi-modal Multi-task Learning

Pîrvu Mihai-Cristian

Leordeanu Marius

181

1

0

11 Oct 2025

Reconstructing the local density field with combined convolutional and point cloud architecture

Baptiste Barthe-Gold

Nhat-Minh Nguyen

185

0

0

09 Oct 2025

Recycling Pretrained Checkpoints: Orthogonal Growth of Mixture-of-Experts for Efficient Large Language Model Pre-Training

145

0

0

09 Oct 2025

Reinforcement Learning-based Task Offloading in the Internet of Wearable Things

Waleed Bin Qaim

Aleksandr Ometov

Claudia Campolo

Antonella Molinaro

144

0

0

08 Oct 2025

Mid-Training of Large Language Models: A Survey

151

0

0

08 Oct 2025

MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis

John Paparrizos

132

1

0

08 Oct 2025

Agent Fine-tuning through Distillation for Domain-specific LLMs in Microdomains

Masaya Tsunokake

Ekant Muljibhai Amin

Takashi Sumiyoshi

Yasuhiro Sogawa

124

0

0

01 Oct 2025

A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features

Axel Barroso-Laguna

Tommaso Cavallari

164

0

0

01 Oct 2025

Purrception: Variational Flow Matching for Vector-Quantized Image Generation

Răzvan-Andrei Matişan

Grigory Bartosh

Cees G. M. Snoek

Jan-Willem van de Meent

Mohammad Mahdi Derakhshani

Floor Eijkelboom

140

1

0

01 Oct 2025

InfVSR: Breaking Length Limits of Generic Video Super-Resolution

159

2

0

01 Oct 2025

Erased, But Not Forgotten: Erased Rectified Flow Transformers Still Remain Unsafe Under Concept Attack

184

0

0

01 Oct 2025

Asymmetric VAE for One-Step Video Super-Resolution Acceleration

111

0

0

29 Sep 2025

Effective Quantization of Muon Optimizer States

Abhishek Shivanna

D. T. Braithwaite

139

0

0

27 Sep 2025

You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors

113

3

0

26 Sep 2025

One Filters All: A Generalist Filter for State Estimation

172

2

0

24 Sep 2025

VGGT-DP: Generalizable Robot Control via Vision Foundation Models

85

0

0

23 Sep 2025

MirrorSAM2: Segment Mirror in Videos with Depth Perception

136

0

0

21 Sep 2025

Advancing Speech Understanding in Speech-Aware Language Models with GRPO

Avishai Elmakies

Hagai Aronowitz

116

1

0

21 Sep 2025

Unlocking Hidden Potential in Point Cloud Networks with Attention-Guided Grouping-Feature Coordination

132

0

0

20 Sep 2025
