Decoupled Weight Decay Regularization

14 November 2017
I. Loshchilov
Katharina Eggensperger
    OffRL

Papers citing "Decoupled Weight Decay Regularization"

50 / 1,216 papers shown
Cross-view Geo-localization with Evolving Transformer
Hongji Yang
Xiufan Lu
Yingying Zhu
ViT
146
18
0
02 Jul 2021
Non-isomorphic Inter-modality Graph Alignment and Synthesis for Holistic Brain Mapping
Information Processing in Medical Imaging (IPMI), 2021
Islem Mhiri
Ahmed Nebli
Mohamed Ali Mahjoub
I. Rekik
102
13
0
30 Jun 2021
OffRoadTranSeg: Semi-Supervised Segmentation using Transformers on OffRoad environments
Anukriti Singh
Kartikeya Singh
P. B. Sujit
ViT
100
11
0
26 Jun 2021
Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations
International Conference on Machine Learning (ICML), 2021
Bodhisattwa Prasad Majumder
Oana-Maria Camburu
Thomas Lukasiewicz
Julian McAuley
365
36
0
25 Jun 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
204
56
0
23 Jun 2021
Adaptive Learning Rate and Momentum for Training Deep Neural Networks
Zhiyong Hao
Yixuan Jiang
Huihua Yu
H. Chiang
ODL
109
14
0
22 Jun 2021
Iterative Network Pruning with Uncertainty Regularization for Lifelong Sentiment Classification
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Binzong Geng
Min Yang
Fajie Yuan
Shupeng Wang
Xiang Ao
Ruifeng Xu
CLL
122
20
0
21 Jun 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
1.0K
1,540
0
18 Jun 2021
Label prompt for multi-label text classification
Rui Song
Xingbing Chen
Zelong Liu
Haining An
Zhiqi Zhang
Xiaoguang Wang
Hao Xu
VLM
154
4
0
18 Jun 2021
PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Heng Zheng
Rui Wen
Xi Chen
Yifan Yang
Yunyan Zhang
Ziheng Zhang
Ningyu Zhang
Bin Qin
Ming Xu
Yefeng Zheng
240
254
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
International Conference on Learning Representations (ICLR), 2021
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
302
222
0
17 Jun 2021
An Empirical Study on Hyperparameter Optimization for Fine-Tuning Pre-trained Language Models
Xueqing Liu
Chi Wang
101
22
0
17 Jun 2021
Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
Computer Vision and Pattern Recognition (CVPR), 2021
Chongzhi Zhang
Mingyuan Zhang
Shanghang Zhang
Daisheng Jin
Qiang-feng Zhou
Zhongang Cai
Haiyu Zhao
Xianglong Liu
Ziwei Liu
207
126
0
14 Jun 2021
Nested and Balanced Entity Recognition using Multi-Task Learning
Andreas Waldis
Luca Mazzola
142
1
0
11 Jun 2021
CAT: Cross Attention in Vision Transformer
IEEE International Conference on Multimedia and Expo (ICME), 2021
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
184
256
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Neural Information Processing Systems (NeurIPS), 2021
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
283
340
0
09 Jun 2021
MVT: Mask Vision Transformer for Facial Expression Recognition in the wild
Hanting Li
Ming-Fa Sui
Feng Zhao
Zhengjun Zha
Feng Wu
ViT
211
95
0
08 Jun 2021
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation
Gabriel Orlanski
Alex Gittens
108
22
0
08 Jun 2021
Let's be explicit about that: Distant supervision for implicit discourse relation classification via connective prediction
Murathan Kurfali
Robert Östling
125
20
0
06 Jun 2021
A Generalizable Approach to Learning Optimizers
Diogo Almeida
Clemens Winter
Jie Tang
Wojciech Zaremba
AI4CE
277
33
0
02 Jun 2021
What Matters for Adversarial Imitation Learning?
Neural Information Processing Systems (NeurIPS), 2021
Manu Orsini
Anton Raichuk
Léonard Hussenot
Damien Vincent
Robert Dadashi
Sertan Girgin
Matthieu Geist
Olivier Bachem
Olivier Pietquin
Marcin Andrychowicz
229
88
0
01 Jun 2021
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Knowledge Discovery and Data Mining (KDD), 2021
Shining Liang
Ming Gong
Jian Pei
Linjun Shou
Wanli Zuo
Xianglin Zuo
Daxin Jiang
207
36
0
01 Jun 2021
VidFace: A Full-Transformer Solver for Video FaceHallucination with Unaligned Tiny Snapshots
Y. Gan
Yawei Luo
Xin Yu
Bang Zhang
Yi Yang
ViT
CVBM
182
3
0
31 May 2021
Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice
Findings (Findings), 2021
Rongzhou Bao
Jiayi Wang
Hai Zhao
AAML
130
50
0
30 May 2021
Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zizhao Zhang
Han Zhang
Long Zhao
Ting Chen
Sercan O. Arik
Tomas Pfister
ViT
357
206
0
26 May 2021
Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking
Findings (Findings), 2021
Heng-Da Xu
Zhongli Li
Qingyu Zhou
Chao Li
Zizhen Wang
Yunbo Cao
Heyan Huang
Xian-Ling Mao
199
109
0
26 May 2021
Guiding the Growth: Difficulty-Controllable Question Generation through Step-by-Step Rewriting
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Yi Cheng
Siyao Li
Bang Liu
Ruihui Zhao
Sujian Li
Chenghua Lin
Yefeng Zheng
195
47
0
25 May 2021
Learning Better Visual Dialog Agents with Pretrained Visual-Linguistic Representation
Computer Vision and Pattern Recognition (CVPR), 2021
Tao Tu
Q. Ping
Govind Thattai
Gokhan Tur
Premkumar Natarajan
182
18
0
24 May 2021
A Sequence-to-Set Network for Nested Named Entity Recognition
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Zeqi Tan
Yongliang Shen
Shuai Zhang
Weiming Lu
Yueting Zhuang
BDL
256
94
0
19 May 2021
On the Distributional Properties of Adaptive Gradients
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Z. Zhiyi
Liu Ziyin
140
4
0
15 May 2021
Towards an Online Empathetic Chatbot with Emotion Causes
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Yanran Li
K. Li
Hongke Ning
Xiaoqiang Xia
Yalong Guo
Chen Wei
Jianwei Cui
Bin Wang
252
54
0
11 May 2021
Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text
Findings (Findings), 2021
Siyuan Wang
Wanjun Zhong
Duyu Tang
Zhongyu Wei
Zhihao Fan
Daxin Jiang
Ming Zhou
Nan Duan
NAI
316
82
0
08 May 2021
Apply Artificial Neural Network to Solving Manpower Scheduling Problem
Tianyu Liu
Lingyu Zhang
84
2
0
07 May 2021
SpeechNet: A Universal Modularized Model for Speech Processing Tasks
Yi-Chen Chen
Po-Han Chi
Shu-Wen Yang
Kai-Wei Chang
Jheng-hao Lin
Sung-Feng Huang
Da-Rong Liu
Chi-Liang Liu
Cheng-Kuang Lee
Hung-yi Lee
MoE
295
19
0
07 May 2021
Emerging Properties in Self-Supervised Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Mathilde Caron
Hugo Touvron
Ishan Misra
Edouard Grave
Julien Mairal
Piotr Bojanowski
Armand Joulin
2.0K
7,910
0
29 Apr 2021
ConTNet: Why not use convolution and transformer at the same time?
Haotian Yan
Zhe Li
Weijian Li
Changhu Wang
Ming Wu
Chuang Zhang
ViT
274
92
0
27 Apr 2021
Prediction, Selection, and Generation: Exploration of Knowledge-Driven Conversation System
Cheng Luo
Dayiheng Liu
Chanjuan Li
Li Lu
Jiancheng Lv
132
0
0
23 Apr 2021
Multiscale Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
481
1,513
0
22 Apr 2021
Vision Transformer Pruning
Mingjian Zhu
Yehui Tang
Kai Han
ViT
475
112
0
17 Apr 2021
How to Train BERT with an Academic Budget
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Peter Izsak
Moshe Berchansky
Omer Levy
338
128
0
15 Apr 2021
Emotion Dynamics Modeling via BERT
IEEE International Joint Conference on Neural Network (IJCNN), 2021
Haiqing Yang
Jianping Shen
207
14
0
15 Apr 2021
From Solving a Problem Boldly to Cutting the Gordian Knot: Idiomatic Text Generation
Jianing Zhou
Hongyu Gong
Srihari Venkat Nanniyur
S. Bhat
181
10
0
13 Apr 2021
Learning and Planning in Complex Action Spaces
International Conference on Machine Learning (ICML), 2021
Thomas Hubert
Julian Schrittwieser
Ioannis Antonoglou
M. Barekatain
Simon Schmitt
David Silver
224
89
0
13 Apr 2021
Online and Offline Reinforcement Learning by Planning with a Learned Model
Neural Information Processing Systems (NeurIPS), 2021
Julian Schrittwieser
Thomas Hubert
Amol Mandhane
M. Barekatain
Ioannis Antonoglou
David Silver
OffRL
223
131
0
13 Apr 2021
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble
Georgios Tziafas
Konstantinos Kogkalidis
Tommaso Caselli
130
9
0
12 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Conference on Machine Learning and Systems (MLSys), 2021
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
168
88
0
11 Apr 2021
SiT: Self-supervised vIsion Transformer
Sara Atito Ali Ahmed
Muhammad Awais
J. Kittler
ViT
379
155
0
08 Apr 2021
Modern Hopfield Networks for Few- and Zero-Shot Reaction Template Prediction
Philipp Seidl
Philipp Renz
N. Dyubankova
Paulo Neves
Jonas Verhoeven
Marwin H. S. Segler
J. Wegner
Sepp Hochreiter
Günter Klambauer
271
17
0
07 Apr 2021
Going deeper with Image Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Edouard Grave
ViT
575
1,188
0
31 Mar 2021
XRJL-HKUST at SemEval-2021 Task 4: WordNet-Enhanced Dual Multi-head Co-Attention for Reading Comprehension of Abstract Meaning
International Workshop on Semantic Evaluation (SemEval), 2021
Yuxin Jiang
Ziyi Shou
Qijun Wang
Hao Wu
Fangzhen Lin
RALM
230
2
0
30 Mar 2021