Decoupled Weight Decay Regularization

14 November 2017

I. Loshchilov

Katharina Eggensperger

OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown

Physics-Grounded Differentiable Simulation for Soft Growing Robots

International Conference on Soft Robotics (RoboSoft), 2025

Laura H. Blumenschein

Zachary Kingston

286

29 Jan 2025

360Brew: A Decoder-only Foundation Model for Personalized Ranking and Recommendation

...

629

27 Jan 2025

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Computer Vision and Pattern Recognition (CVPR), 2025

761

165

23 Jan 2025

3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results

...

191

20 Jan 2025

PolyLUT: Ultra-low Latency Polynomial Inference with Hardware-Aware Structured Pruning

IEEE Transactions on Computers (IEEE Trans. Comput.), 2025

Marta Andronic

Jiawen Li

George A. Constantinides

167

14 Jan 2025

Optimizing Small Language Models for In-Vehicle Function-Calling

150

04 Jan 2025

DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT

374

31 Dec 2024

Positive2Negative: Breaking the Information-Lossy Barrier in Self-Supervised Single Image Denoising

Computer Vision and Pattern Recognition (CVPR), 2024

427

21 Dec 2024

Bag of Tricks for Multimodal AutoML with Image, Text, and Tabular Data

384

19 Dec 2024

HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages

Information Security Conference (IS), 2024

306

19 Dec 2024

Jet: A Modern Transformer-Based Normalizing Flow

Alexander Kolesnikov

André Susano Pinto

Michael Tschannen

244

19 Dec 2024

GaraMoSt: Parallel Multi-Granularity Motion and Structural Modeling for Efficient Multi-Frame Interpolation in DSA Images

AAAI Conference on Artificial Intelligence (AAAI), 2024

302

18 Dec 2024

Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach

International Conference on Computational Linguistics (COLING), 2024

263

16 Dec 2024

Learning Implicit Features with Flow Infused Attention for Realistic Virtual Try-On

278

16 Dec 2024

Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces

Nianze Tao

OOD OODD BDL

676

16 Dec 2024

APAR: Modeling Irregular Target Functions in Tabular Regression via Arithmetic-Aware Pre-Training and Adaptive-Regularized Fine-Tuning

AAAI Conference on Artificial Intelligence (AAAI), 2024

391

14 Dec 2024

Exploring Grokking: Experimental and Mechanistic Investigations

Hu Qiye

Zhou Hao

Yu RuoXi

368

14 Dec 2024

Dynamic Try-On: Taming Video Virtual Try-on with Dynamic Attention Mechanism

360

13 Dec 2024

GR-NLP-TOOLKIT: An Open-Source NLP Toolkit for Modern Greek

International Conference on Computational Linguistics (COLING), 2024

Anastasios Toumazatos

...

268

11 Dec 2024

SweetieChat: A Strategy-Enhanced Role-playing Framework for Diverse Scenarios Handling Emotional Support Agent

International Conference on Computational Linguistics (COLING), 2024

451

11 Dec 2024

FlashSloth: Lightning Multimodal Large Language Models via Embedded Visual Compression

Computer Vision and Pattern Recognition (CVPR), 2024

246

05 Dec 2024

Reinforcement Learning from Wild Animal Videos

955

05 Dec 2024

Unified Framework for Open-World Compositional Zero-shot Learning

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2024

306

05 Dec 2024

GuARD: Effective Anomaly Detection through a Text-Rich and Graph-Informed Language Model

296

05 Dec 2024

HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior

227

27 Nov 2024

Cautious Optimizers: Improving Training with One Line of Code

712

25 Nov 2024

RECAST: Reparameterized, Compact weight Adaptation for Sequential Tasks

International Conference on Learning Representations (ICLR), 2024

Nazia Tasnim

Bryan A. Plummer

CLL OffRL

471

25 Nov 2024

Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization

403

24 Nov 2024

Financial Risk Assessment via Long-term Payment Behavior Sequence Folding

Industrial Conference on Data Mining (IDM), 2024

221

22 Nov 2024

Entropy Bootstrapping for Weakly Supervised Nuclei Detection

James Willoughby

Irina Voiculescu

UQCV

236

20 Nov 2024

A Theory for Compressibility of Graph Transformers for Transductive Learning

308

20 Nov 2024

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

279

15 Nov 2024

Pay Attention to the Keys: Visual Piano Transcription Using Transformers

International Joint Conference on Artificial Intelligence (IJCAI), 2024

Uros Zivanovic

Ivan Pilkov

Carlos Eduardo Cancino-Chacón

ViT

181

13 Nov 2024

MEANT: Multimodal Encoder for Antecedent Information

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

Benjamin Iyoya Irving

Annika Marie Schoene

AIFin

180

10 Nov 2024

Multi-View Majority Vote Learning Algorithms: Direct Minimization of PAC-Bayesian Bounds

326

09 Nov 2024

Few-Shot Task Learning through Inverse Generative Modeling

Neural Information Processing Systems (NeurIPS), 2024

491

07 Nov 2024

Learning to Unify Audio, Visual and Text for Audio-Enhanced Multilingual Visual Answer Localization

Zhibin Wen

Bin Li

193

05 Nov 2024

Expanding Sparse Tuning for Low Memory Usage

Neural Information Processing Systems (NeurIPS), 2024

330

04 Nov 2024

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Chaoyou Fu

Ke Li

Long Ma

432

103

01 Nov 2024

Joint Extraction and Classification of Danish Competences for Job Matching

European Conference on Information Retrieval (ECIR), 2024

Qiuchi Li

Christina Lioma

141

29 Oct 2024

USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (IMWUT), 2024

201

29 Oct 2024

Super-resolution in disordered media using neural networks

241

28 Oct 2024

Mixture of Parrots: Experts improve memorization more than reasoning

International Conference on Learning Representations (ICLR), 2024

370

24 Oct 2024

Lightweight Neural App Control

International Conference on Learning Representations (ICLR), 2024

Jun Wang

Youssef Attia El Hili

LM&Ro

252

23 Oct 2024

Publishing Neural Networks in Drug Discovery Might Compromise Training Data Privacy

Journal of Cheminformatics (J Cheminform), 2024

261

22 Oct 2024

Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding

International Conference on Pattern Recognition (ICPR), 2024

Yang Liu

Daizong Liu

Wei Hu

3DPC

381

21 Oct 2024

Catastrophic Failure of LLM Unlearning via Quantization

International Conference on Learning Representations (ICLR), 2024

Zhiwei Zhang

Fali Wang

Xiaomin Li

Zongyu Wu

Xianfeng Tang

Hui Liu

Qi He

Wenpeng Yin

Suhang Wang

331

21 Oct 2024

Non-invasive Neural Decoding in Source Reconstructed Brain Space

Yonatan Gideoni

Ryan Charles Timms

Oiwi Parker Jones

218

20 Oct 2024

Physically Guided Deep Unsupervised Inversion for 1D Magnetotelluric Models

IEEE Geoscience and Remote Sensing Letters (GRSL), 2024

Paul Goyes-Peñafiel

Umair bin Waheed

Henry Arguello

121

20 Oct 2024

Cliqueformer: Model-Based Optimization with Structured Transformers

J. Kuba

Pieter Abbeel

Sergey Levine

OffRL AI4CE

464

17 Oct 2024