
Decoupled Weight Decay Regularization

14 November 2017

I. Loshchilov

Frank Hutter

OffRL


Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown

State Value Generation with Prompt Learning and Self-Training for Low-Resource Dialogue State Tracking

191

30 Jan 2024

Liquid Resistance Liquid Capacitance Networks

260

30 Jan 2024

Triple Disentangled Representation Learning for Multimodal Affective Analysis

Information Fusion (Inf. Fusion), 2024

235

29 Jan 2024

PILOT: Legal Case Outcome Prediction with Case Law

North American Chapter of the Association for Computational Linguistics (NAACL), 2024

Jimeng Sun

244

28 Jan 2024

SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection

Foozhan Ataiefard

Walid Ahmed

Habib Hajimolahoseini

Yang Liu

196

27 Jan 2024

P2Seg: Pointly-supervised Segmentation via Mutual Distillation

263

18 Jan 2024

Stream Query Denoising for Vectorized HD Map Construction

European Conference on Computer Vision (ECCV), 2024

Shuo Wang

Xiangyu Zhang

289

17 Jan 2024

Harnessing Orthogonality to Train Low-Rank Neural Networks

European Conference on Artificial Intelligence (ECAI), 2024

383

16 Jan 2024

Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video

AAAI Conference on Artificial Intelligence (AAAI), 2024

283

15 Jan 2024

DiffDA: a Diffusion Model for Weather-scale Data Assimilation

International Conference on Machine Learning (ICML), 2024

354

11 Jan 2024

Text2MDT: Extracting Medical Decision Trees from Medical Texts

241

04 Jan 2024

Enhancing Automatic Modulation Recognition through Robust Global Feature Extraction

IEEE Transactions on Vehicular Technology (IEEE Trans. Veh. Technol.), 2024

219

02 Jan 2024

Make BERT-based Chinese Spelling Check Model Enhanced by Layerwise Attention and Gaussian Mixture Model

171

27 Dec 2023

COOPER: Coordinating Specialized Agents towards a Complex Dialogue Goal

Jian Wang

237

19 Dec 2023

Towards an end-to-end artificial intelligence driven global weather forecasting system

...

380

18 Dec 2023

ElasticLaneNet: An Efficient Geometry-Flexible Approach for Lane Detection

268

16 Dec 2023

Data-Efficient Multimodal Fusion on a Single GPU

Computer Vision and Pattern Recognition (CVPR), 2023

464

15 Dec 2023

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

Computer Vision and Pattern Recognition (CVPR), 2023

...

Jingdong Chen

Ming Yang

Yongjun Zhang

Yansheng Li

374

235

15 Dec 2023

Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Junhao Zheng

Shengjie Qiu

Qianli Ma

395

13 Dec 2023

Gated Linear Attention Transformers with Hardware-Efficient Training

Bailin Wang

478

303

11 Dec 2023

Early Action Recognition with Action Prototypes

134

11 Dec 2023

An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis

Via Nielson

Steven Hillis

08 Dec 2023

Trajeglish: Traffic Modeling as Next-Token Prediction

Jonah Philion

Xue Bin Peng

Sanja Fidler

226

07 Dec 2023

DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection

Nannan Wang

Xinbo Gao

188

07 Dec 2023

Expand BERT Representation with Visual Information via Grounded Language Learning with Multimodal Partial Alignment

ACM Multimedia (ACM MM), 2023

338

04 Dec 2023

Learning Part Segmentation from Synthetic Animals

291

30 Nov 2023

Perceptual Group Tokenizer: Building Perception with Iterative Grouping

International Conference on Learning Representations (ICLR), 2023

218

30 Nov 2023

Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation

...

Alan Karthikesalingam

Ira Ktena

MedIm

260

30 Nov 2023

Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning

Computer Vision and Pattern Recognition (CVPR), 2023

Yiwen Ye

Yutong Xie

Jianpeng Zhang

Ziyang Chen

Qi Wu

Yong-quan Xia

CLL

250

29 Nov 2023

CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Yang Liu

234

29 Nov 2023

Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding

267

25 Nov 2023

Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation

Computer Vision and Pattern Recognition (CVPR), 2023

452

22 Nov 2023

Learning with Chemical versus Electrical Synapses -- Does it Make a Difference?

IEEE International Conference on Robotics and Automation (ICRA), 2023

Mónika Farsang

Mathias Lechner

David Lung

Ramin Hasani

Daniela Rus

Radu Grosu

101

21 Nov 2023

Few-shot Multispectral Segmentation with Representations Generated by Reinforcement Learning

Dilith Jayakody

Thanuja D. Ambegoda

128

20 Nov 2023

PhytNet -- Tailored Convolutional Neural Networks for Custom Botanical Data

Jamie R. Sykes

Katherine Denby

Daniel W. Franks

158

20 Nov 2023

Coarse-Grained Configurational Polymer Fingerprints for Property Prediction using Machine Learning

Ishan Kumar

P. Jha

20 Nov 2023

Segment Anything in Defect Detection

Bozhen Hu

Bin Gao

Cheng Tan

Tongle Wu

Stan Z. Li

102

17 Nov 2023

Neural machine translation for automated feedback on children's early-stage writing

Jonas Vestergaard Jensen

Mikkel Jordahn

Michael Riis Andersen

266

15 Nov 2023

Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts

Smaranda Muresan

326

15 Nov 2023

Data Augmentations in Deep Weight Spaces

...

332

15 Nov 2023

Token Prediction as Implicit Classification to Identify LLM-Generated Text

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Bhiksha Raj

185

15 Nov 2023

Probabilistic reconstruction of Dark Matter fields from biased tracers using diffusion models

161

14 Nov 2023

The Transient Nature of Emergent In-Context Learning in Transformers

Neural Information Processing Systems (NeurIPS), 2023

481

14 Nov 2023

On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks

Daniel Claborne

Eric Slyman

Karl Pazdernik

169

09 Nov 2023

mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration

Computer Vision and Pattern Recognition (CVPR), 2023

Jiabo Ye

Ji Zhang

Fei Huang

Jingren Zhou

MLLM VLM

469

600

07 Nov 2023

MultiSPANS: A Multi-range Spatial-Temporal Transformer Network for Traffic Forecast via Structural Entropy Optimization

Web Search and Data Mining (WSDM), 2023

Hao Peng

175

06 Nov 2023

GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values

Farnoosh Javadi

Walid Ahmed

Habib Hajimolahoseini

310

06 Nov 2023

Robust Generalization Strategies for Morpheme Glossing in an Endangered Language Documentation Context

Michael Ginn

Alexis Palmer

197

05 Nov 2023

A New Korean Text Classification Benchmark for Recognizing the Political Intents in Online Newspapers

Beomjune Kim

Eunsun Lee

Dongbin Na

142

03 Nov 2023

Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

Neural Information Processing Systems (NeurIPS), 2023

269

02 Nov 2023