Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Frank Hutter
    OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown
State Value Generation with Prompt Learning and Self-Training for Low-Resource Dialogue State Tracking
Ming Gu
Yan Yang
Chengcai Chen
Zhou Yu
191
0
0
30 Jan 2024
Liquid Resistance Liquid Capacitance Networks
Mónika Farsang
Sophie A. Neubauer
Radu Grosu
AI4TS
260
5
0
30 Jan 2024
Triple Disentangled Representation Learning for Multimodal Affective Analysis
Information Fusion (Inf. Fusion), 2024
Ying Zhou
Xuefeng Liang
Han Chen
Yin Zhao
Xin Chen
Lida Yu
235
14
0
29 Jan 2024
PILOT: Legal Case Outcome Prediction with Case Law
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Lang Cao
Zifeng Wang
Cao Xiao
Jimeng Sun
AILaw
ELM
244
18
0
28 Jan 2024
SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection
Foozhan Ataiefard
Walid Ahmed
Habib Hajimolahoseini
Saina Asani
Farnoosh Javadi
Mohammad Hassanpour
Omar Mohamed Awad
Austin Wen
Kangling Liu
Yang Liu
ViT
196
3
0
27 Jan 2024
P2Seg: Pointly-supervised Segmentation via Mutual Distillation
Zipeng Wang
Xuehui Yu
Xumeng Han
Wenwen Yu
Zhixun Huang
Jianbin Jiao
Zhenjun Han
263
1
0
18 Jan 2024
Stream Query Denoising for Vectorized HD Map Construction
European Conference on Computer Vision (ECCV), 2024
Shuo Wang
Fan Jia
Yingfei Liu
Yucheng Zhao
Zehui Chen
Tiancai Wang
Chi Zhang
Xiangyu Zhang
Feng Zhao
289
40
0
17 Jan 2024
Harnessing Orthogonality to Train Low-Rank Neural Networks
European Conference on Artificial Intelligence (ECAI), 2024
D. Coquelin
Katharina Flügel
Marie Weiel
Nicholas Kiefer
Charlotte Debus
Achim Streit
Markus Goetz
383
5
0
16 Jan 2024
Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video
AAAI Conference on Artificial Intelligence (AAAI), 2024
Zhaobo Qi
Yibo Yuan
Xiaowen Ruan
Shuhui Wang
Weigang Zhang
Qingming Huang
283
11
0
15 Jan 2024
DiffDA: a Diffusion Model for Weather-scale Data Assimilation
International Conference on Machine Learning (ICML), 2024
Langwen Huang
Lukas Gianinazzi
Yuejiang Yu
P. Dueben
Torsten Hoefler
354
70
0
11 Jan 2024
Text2MDT: Extracting Medical Decision Trees from Medical Texts
Wei-wei Zhu
Wenfeng Li
Xing Tian
Pengfei Wang
Xiaoling Wang
Jin Chen
Man Lan
Yuan Ni
Guotong Xie
241
6
0
04 Jan 2024
Enhancing Automatic Modulation Recognition through Robust Global Feature Extraction
IEEE Transactions on Vehicular Technology (IEEE Trans. Veh. Technol.), 2024
Yunpeng Qu
Zhilin Lu
Rui Zeng
Jintao Wang
Jian Wang
219
27
0
02 Jan 2024
Make BERT-based Chinese Spelling Check Model Enhanced by Layerwise Attention and Gaussian Mixture Model
Yongchang Cao
Liang He
Zhanghua Wu
Xinyu Dai
171
2
0
27 Dec 2023
COOPER: Coordinating Specialized Agents towards a Complex Dialogue Goal
Yi Cheng
Wenge Liu
Jian Wang
Chak Tou Leong
Ouyang Yi
Wenjie Li
Xian Wu
Yefeng Zheng
LLMAG
237
26
0
19 Dec 2023
Towards an end-to-end artificial intelligence driven global weather forecasting system
Kun Chen
Mengwei He
Zhangrui Li
Peng Ye
Tao Chen
...
Yi Xiao
Kang Chen
Tao Han
Jing-Jia Luo
Wanli Ouyang
AI4Cl
380
27
0
18 Dec 2023
ElasticLaneNet: An Efficient Geometry-Flexible Approach for Lane Detection
Yaxin Feng
Yuan Lan
Luchan Zhang
Yang Xiang
268
1
0
16 Dec 2023
Data-Efficient Multimodal Fusion on a Single GPU
Computer Vision and Pattern Recognition (CVPR), 2023
Noël Vouitsis
Zhaoyan Liu
S. Gorti
Valentin Villecroze
Jesse C. Cresswell
Guangwei Yu
Gabriel Loaiza-Ganem
Anthony L. Caterini
464
11
0
15 Dec 2023
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
Computer Vision and Pattern Recognition (CVPR), 2023
Xin Guo
Jiangwei Lao
Bo Dang
Yingying Zhang
Lei Yu
...
Jian Wang
Jingdong Chen
Ming Yang
Yongjun Zhang
Yansheng Li
374
235
0
15 Dec 2023
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Junhao Zheng
Shengjie Qiu
Qianli Ma
395
13
0
13 Dec 2023
Gated Linear Attention Transformers with Hardware-Efficient Training
Aaron Courville
Bailin Wang
Songlin Yang
Yikang Shen
Yoon Kim
478
303
0
11 Dec 2023
Early Action Recognition with Action Prototypes
G. Camporese
Alessandro Bergamo
Xunyu Lin
Joseph Tighe
Davide Modolo
EgoV
134
0
0
11 Dec 2023
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis
Via Nielson
Steven Hillis
81
0
0
08 Dec 2023
Trajeglish: Traffic Modeling as Next-Token Prediction
Jonah Philion
Xue Bin Peng
Sanja Fidler
226
46
0
07 Dec 2023
DeepFidelity: Perceptual Forgery Fidelity Assessment for Deepfake Detection
Chunlei Peng
Huiqing Guo
Decheng Liu
Nannan Wang
Ruimin Hu
Xinbo Gao
CVBM
188
5
0
07 Dec 2023
Expand BERT Representation with Visual Information via Grounded Language Learning with Multimodal Partial Alignment
ACM Multimedia (ACM MM), 2023
Cong-Duy Nguyen
The-Anh Vu-Le
Thong Nguyen
Tho Quan
Anh Tuan Luu
338
7
0
04 Dec 2023
Learning Part Segmentation from Synthetic Animals
Jiawei Peng
Ju He
Prakhar Kaushik
Zihao Xiao
Jiteng Mu
Yaoyao Liu
291
3
0
30 Nov 2023
Perceptual Group Tokenizer: Building Perception with Iterative Grouping
International Conference on Learning Representations (ICLR), 2023
Zhiwei Deng
Ting Chen
Yang Li
ViT
VLM
218
3
0
30 Nov 2023
Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation
Ryutaro Tanno
D. G. Barrett
Andrew Sellergren
Sumedh Ghaisas
Sumanth Dathathri
...
S. Shetty
Pushmeet Kohli
Po-Sen Huang
Alan Karthikesalingam
Ira Ktena
MedIm
260
15
0
30 Nov 2023
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Yiwen Ye
Yutong Xie
Jianpeng Zhang
Ziyang Chen
Qi Wu
Yong-quan Xia
CLL
250
42
0
29 Nov 2023
CESAR: Automatic Induction of Compositional Instructions for Multi-turn Dialogs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Taha İbrahim Aksu
Devamanyu Hazarika
Shikib Mehri
Seokhwan Kim
Dilek Z. Hakkani-Tür
Yang Liu
Mahdi Namazifar
234
2
0
29 Nov 2023
Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding
Ruyang Liu
Jingjia Huang
Wei-Nan Gao
Thomas H. Li
Ge Li
VLM
267
4
0
25 Nov 2023
Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Daichi Horita
Naoto Inoue
Kotaro Kikuchi
Kota Yamaguchi
Kiyoharu Aizawa
3DV
452
42
0
22 Nov 2023
Learning with Chemical versus Electrical Synapses -- Does it Make a Difference?
IEEE International Conference on Robotics and Automation (ICRA), 2023
Mónika Farsang
Mathias Lechner
David Lung
Ramin Hasani
Daniela Rus
Radu Grosu
101
6
0
21 Nov 2023
Few-shot Multispectral Segmentation with Representations Generated by Reinforcement Learning
Dilith Jayakody
Thanuja D. Ambegoda
128
0
0
20 Nov 2023
PhytNet -- Tailored Convolutional Neural Networks for Custom Botanical Data
Jamie R. Sykes
Katherine Denby
Daniel W. Franks
158
2
0
20 Nov 2023
Coarse-Grained Configurational Polymer Fingerprints for Property Prediction using Machine Learning
Ishan Kumar
P. Jha
43
1
0
20 Nov 2023
Segment Anything in Defect Detection
Bozhen Hu
Bin Gao
Cheng Tan
Tongle Wu
Stan Z. Li
102
8
0
17 Nov 2023
Neural machine translation for automated feedback on children's early-stage writing
Jonas Vestergaard Jensen
Mikkel Jordahn
Michael Riis Andersen
266
0
0
15 Nov 2023
Identifying Self-Disclosures of Use, Misuse and Addiction in Community-based Social Media Posts
Chenghao Yang
Tuhin Chakrabarty
K. Hochstatter
M. Slavin
N. El-Bassel
Smaranda Muresan
326
4
0
15 Nov 2023
Data Augmentations in Deep Weight Spaces
Aviv Shamsian
David W. Zhang
Aviv Navon
Yan Zhang
Miltiadis Kofinas
...
E. Gavves
Cees G. M. Snoek
Ethan Fetaya
Gal Chechik
Haggai Maron
332
3
0
15 Nov 2023
Token Prediction as Implicit Classification to Identify LLM-Generated Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yutian Chen
Hao Kang
Vivian Zhai
Liangze Li
Rita Singh
Bhiksha Raj
DeLMO
185
43
0
15 Nov 2023
Probabilistic reconstruction of Dark Matter fields from biased tracers using diffusion models
Core Francisco Park
Victoria Ono
N. Mudur
Yueying Ni
C. Cuesta-Lázaro
DiffM
161
5
0
14 Nov 2023
The Transient Nature of Emergent In-Context Learning in Transformers
Neural Information Processing Systems (NeurIPS), 2023
Aaditya K. Singh
Stephanie C. Y. Chan
Ted Moskovitz
Erin Grant
Andrew M. Saxe
Felix Hill
481
63
0
14 Nov 2023
On the Behavior of Audio-Visual Fusion Architectures in Identity Verification Tasks
Daniel Claborne
Eric Slyman
Karl Pazdernik
169
0
0
09 Nov 2023
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Computer Vision and Pattern Recognition (CVPR), 2023
Qinghao Ye
Haiyang Xu
Jiabo Ye
Mingshi Yan
Anwen Hu
Haowei Liu
Qi Qian
Ji Zhang
Fei Huang
Jingren Zhou
MLLM
VLM
469
600
0
07 Nov 2023
MultiSPANS: A Multi-range Spatial-Temporal Transformer Network for Traffic Forecast via Structural Entropy Optimization
Web Search and Data Mining (WSDM), 2023
Dongcheng Zou
Senzhang Wang
Xuefeng Li
Hao Peng
Yuandong Wang
Chunyang Liu
Kehua Sheng
Bo Zhang
AI4TS
175
45
0
06 Nov 2023
GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values
Farnoosh Javadi
Walid Ahmed
Habib Hajimolahoseini
Foozhan Ataiefard
Mohammad Hassanpour
Saina Asani
Austin Wen
Omar Mohamed Awad
Kangling Liu
Yang Liu
VLM
310
8
0
06 Nov 2023
Robust Generalization Strategies for Morpheme Glossing in an Endangered Language Documentation Context
Michael Ginn
Alexis Palmer
197
5
0
05 Nov 2023
A New Korean Text Classification Benchmark for Recognizing the Political Intents in Online Newspapers
Beomjune Kim
Eunsun Lee
Dongbin Na
142
1
0
03 Nov 2023
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
Neural Information Processing Systems (NeurIPS), 2023
Peng Jin
Yang Wu
Yanbo Fan
Zhongqian Sun
Yang Wei
Li-ming Yuan
DiffM
269
44
0
02 Nov 2023