Decoupled Weight Decay Regularization

14 November 2017

I. Loshchilov

Katharina Eggensperger

OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown

A mean teacher algorithm for unlearning of language models

Yegor Klochkov

648

18 Apr 2025

NNTile: a machine learning framework capable of training extremely large GPT language models on a single node

165

17 Apr 2025

Multi-Object Grounding via Hierarchical Contrastive Siamese Transformers

Chengyi Du

Keyan Jin

235

14 Apr 2025

Small Object Detection with YOLO: A Performance Analysis Across Model Versions and Hardware

Muhammad Fasih Tariq

Muhammad Azeem Javed

ObjD

277

14 Apr 2025

Decoupled Diffusion Sparks Adaptive Scene Generation

Christoffer Petersson

Hongyang Li

262

14 Apr 2025

Towards Quantifying Commonsense Reasoning with Mechanistic Insights

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

252

14 Apr 2025

MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models to Accelerate Materials Simulation and Discovery

Digital Discovery (DD), 2025

255

14 Apr 2025

A Model Zoo of Vision Transformers

504

14 Apr 2025

Gradient as Conditions: Rethinking HOG for All-in-one Image Restoration

314

12 Apr 2025

SD²: Self-Distilled Sparse Drafters

Mike Lasby

Nish Sinnadurai

Valavan Manohararajah

Sean Lie

Yani Andrew Ioannou

Vithursan Thangarasa

786

10 Apr 2025

Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment

Computer Vision and Pattern Recognition (CVPR), 2025

304

03 Apr 2025

NeuraLUT-Assemble: Hardware-aware Assembling of Sub-Neural Networks for Efficient LUT Inference

IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), 2025

Marta Andronic

George A. Constantinides

326

01 Apr 2025

FIESTA: Fisher Information-based Efficient Selective Test-time Adaptation

Mohammadmahdi Honarmand

266

29 Mar 2025

NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving

372

28 Mar 2025

SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS), 2025

405

26 Mar 2025

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Computer Vision and Pattern Recognition (CVPR), 2025

329

24 Mar 2025

LeanStereo: A Leaner Backbone based Stereo Network

IEEE International Joint Conference on Neural Networks (IJCNN), 2023

329

24 Mar 2025

Beyond Accuracy: What Matters in Designing Well-Behaved Image Classification Models?

444

21 Mar 2025

PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning

359

21 Mar 2025

Classification of User Reports for Detection of Faulty Computer Components using NLP Models: A Case Study

Maria de Lourdes M. Silva

128

20 Mar 2025

Learn Your Scales: Towards Scale-Consistent Generative Novel View Synthesis

Fereshteh Forghani

Jason J. Yu

Tristan Aumentado-Armstrong

Konstantinos G. Derpanis

Marcus A. Brubaker

DiffM

332

19 Mar 2025

VenusFactory: A Unified Platform for Protein Engineering Data Retrieval and Language Model Fine-Tuning

...

200

19 Mar 2025

Quantum EigenGame for excited state calculation

David Quiroga

Jason Han

Anastasios Kyrillidis

280

17 Mar 2025

Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors

301

17 Mar 2025

L2HCount: Generalizing Crowd Counting from Low to High Crowd Density via Density Simulation

257

17 Mar 2025

OpeNLGauge: An Explainable Metric for NLG Evaluation with Open-Weights LLMs

364

14 Mar 2025

Text Compression for Efficient Language Generation

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

David Gu

Peter Belcak

Roger Wattenhofer

241

14 Mar 2025

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Computer Vision and Pattern Recognition (CVPR), 2025

1.0K

12 Mar 2025

The R2D2 Deep Neural Network Series for Scalable Non-Cartesian Magnetic Resonance Imaging

193

12 Mar 2025

TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos

313

09 Mar 2025

Improving SAM for Camouflaged Object Detection via Dual Stream Adapters

Jiaming Liu

Linghe Kong

Guihai Chen

325

08 Mar 2025

PointsToWood: A deep learning framework for complete canopy leaf-wood segmentation of TLS data across diverse European forests

119

06 Mar 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Annual Meeting of the Association for Computational Linguistics (ACL), 2025

Siyang Song

Mohammed Irfan Kurpath

Sahal Shaji Mullappilly

656

06 Mar 2025

Lead Instrument Detection from Multitrack Music

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

Longshen Ou

Yu Takahashi

Ye Wang

188

05 Mar 2025

GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control

Computer Vision and Pattern Recognition (CVPR), 2025

322

125

05 Mar 2025

All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning

438

03 Mar 2025

Efficiently Editing Mixture-of-Experts Models with Compressed Experts

316

01 Mar 2025

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model

415

26 Feb 2025

Reference-Aligned Retrieval-Augmented Question Answering over Heterogeneous Proprietary Documents

771

26 Feb 2025

FLINT: Learning-based Flow Estimation and Temporal Interpolation for Scientific Ensemble Visualization

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024

Hamid Gadirov

Jos B. T. M. Roerdink

Steffen Frey

AI4CE

271

24 Feb 2025

Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification

315

24 Feb 2025

DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis

358

21 Feb 2025

MoM: Linear Sequence Modeling with Mixture-of-Memories

553

19 Feb 2025

ALGEN: Few-shot Inversion Attacks on Textual Embeddings using Alignment and Generation

Yiyi Chen

Qiongkai Xu

Johannes Bjerva

403

16 Feb 2025

Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation

North American Chapter of the Association for Computational Linguistics (NAACL), 2025

Palaash Goel

Dushyant Singh Chauhan

Md. Shad Akhtar

LRM

300

11 Feb 2025

MatSwap: Light-aware material transfers in images

Ivan Lopes

Valentin Deschaintre

Yannick Hold-Geoffroy

Raoul de Charette

DiffM

508

11 Feb 2025

AppVLM: A Lightweight Vision Language Model for Online App Control

Youssef Attia El Hili

299

10 Feb 2025

deCIFer: Crystal Structure Prediction from Powder Diffraction Data using Autoregressive Language Models

764

04 Feb 2025

CoddLLM: Empowering Large Language Models for Data Analytics

Asterios Katsifodimos

890

01 Feb 2025

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

410

30 Jan 2025