1711.05101
Cited By
Decoupled Weight Decay Regularization
14 November 2017
I. Loshchilov
Frank Hutter
OffRL
Papers citing "Decoupled Weight Decay Regularization"
50 / 329 papers shown
Title
ReGen: Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models
Pierre L. Dognin
Inkit Padhi
Igor Melnyk
Payel Das
OffRL
16
20
0
27 Aug 2021
Disentangling Hate in Online Memes
Rui Cao
Ziqing Fan
Roy Ka-Wei Lee
Wen-Haw Chong
Jing Jiang
24
76
0
09 Aug 2021
Large-Scale Differentially Private BERT
Rohan Anil
Badih Ghazi
Vineet Gupta
Ravi Kumar
Pasin Manurangsi
33
131
0
03 Aug 2021
Improving Robustness and Accuracy via Relative Information Encoding in 3D Human Pose Estimation
Wenkang Shan
Haopeng Lu
Shanshe Wang
Xinfeng Zhang
Wen Gao
3DH
22
63
0
29 Jul 2021
A Deep Learning-based Quality Assessment and Segmentation System with a Large-scale Benchmark Dataset for Optical Coherence Tomographic Angiography Image
Yu-Fang Wang
Yiqing Shen
Meng Yuan
Jing Xu
B. Yang
Chicheng Liu
Wenjia Cai
Weijing Cheng
Wei Wang
25
18
0
22 Jul 2021
Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations
Bodhisattwa Prasad Majumder
Oana-Maria Camburu
Thomas Lukasiewicz
Julian McAuley
23
35
0
25 Jun 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
32
1,149
0
18 Jun 2021
Label prompt for multi-label text classification
Rui Song
Xingbing Chen
Zelong Liu
Haining An
Zhiqi Zhang
Xiaoguang Wang
Hao Xu
VLM
15
4
0
18 Jun 2021
PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction
Heng Zheng
Rui Wen
Xi Chen
Yifan Yang
Yunyan Zhang
Ziheng Zhang
Ningyu Zhang
Bin Qin
Ming Xu
Yefeng Zheng
24
197
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
32
209
0
17 Jun 2021
Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
Chongzhi Zhang
Mingyuan Zhang
Shanghang Zhang
Daisheng Jin
Qiang-feng Zhou
Zhongang Cai
Haiyu Zhao
Xianglong Liu
Ziwei Liu
18
102
0
14 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
27
149
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
8
274
0
09 Jun 2021
MVT: Mask Vision Transformer for Facial Expression Recognition in the wild
Hanting Li
Ming-Fa Sui
Feng Zhao
Zhengjun Zha
Feng Wu
ViT
31
75
0
08 Jun 2021
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation
Gabriel Orlanski
Alex Gittens
29
20
0
08 Jun 2021
What Matters for Adversarial Imitation Learning?
Manu Orsini
Anton Raichuk
Léonard Hussenot
Damien Vincent
Robert Dadashi
Sertan Girgin
M. Geist
Olivier Bachem
Olivier Pietquin
Marcin Andrychowicz
42
77
0
01 Jun 2021
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Shining Liang
Ming Gong
J. Pei
Linjun Shou
Wanli Zuo
Xianglin Zuo
Daxin Jiang
31
34
0
01 Jun 2021
VidFace: A Full-Transformer Solver for Video Face Hallucination with Unaligned Tiny Snapshots
Y. Gan
Yawei Luo
Xin Yu
Bang Zhang
Yi Yang
ViT
CVBM
22
3
0
31 May 2021
Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking
Heng-Da Xu
Zhongli Li
Qingyu Zhou
Chao Li
Zizhen Wang
Yunbo Cao
Heyan Huang
Xian-Ling Mao
40
94
0
26 May 2021
A Sequence-to-Set Network for Nested Named Entity Recognition
Zeqi Tan
Yongliang Shen
Shuai Zhang
Weiming Lu
Yueting Zhuang
BDL
11
78
0
19 May 2021
Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text
Siyuan Wang
Wanjun Zhong
Duyu Tang
Zhongyu Wei
Zhihao Fan
Daxin Jiang
Ming Zhou
Nan Duan
NAI
31
70
0
08 May 2021
ConTNet: Why not use convolution and transformer at the same time?
Haotian Yan
Zhe Li
Weijian Li
Changhu Wang
Ming Wu
Chuang Zhang
ViT
14
76
0
27 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
54
1,222
0
22 Apr 2021
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
12
113
0
15 Apr 2021
Emotion Dynamics Modeling via BERT
Haiqing Yang
Jianping Shen
24
11
0
15 Apr 2021
Fighting the COVID-19 Infodemic with a Holistic BERT Ensemble
Georgios Tziafas
Konstantinos Kogkalidis
Tommaso Caselli
19
9
0
12 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
17
68
0
11 Apr 2021
SiT: Self-supervised vIsion Transformer
Sara Atito Ali Ahmed
Muhammad Awais
J. Kittler
ViT
33
139
0
08 Apr 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
25
986
0
31 Mar 2021
R-GSN: The Relation-based Graph Similar Network for Heterogeneous Graph
Xinliang Wu
Mengying Jiang
Guizhong Liu
GNN
22
7
0
14 Mar 2021
Bidirectional Machine Reading Comprehension for Aspect Sentiment Triplet Extraction
Shaowei Chen
Yu Wang
Jie Liu
Yuelin Wang
22
176
0
13 Mar 2021
ZJUKLAB at SemEval-2021 Task 4: Negative Augmentation with Language Model for Reading Comprehension of Abstract Meaning
Xin Xie
Xiangnan Chen
Xiang Chen
Yong Wang
Ningyu Zhang
Shumin Deng
Huajun Chen
34
2
0
25 Feb 2021
Multilingual Answer Sentence Reranking via Automatically Translated Data
Thuy Vu
Alessandro Moschitti
22
5
0
20 Feb 2021
Meta-Learning for Effective Multi-task and Multilingual Modelling
Ishan Tarunesh
Sushil Khyalia
Vishwajeet Kumar
Ganesh Ramakrishnan
P. Jyothi
31
16
0
25 Jan 2021
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen
Junqing Chen
Xiaojun Quan
Zhixiang Xie
16
199
0
16 Dec 2020
Topological Planning with Transformers for Vision-and-Language Navigation
Kevin Chen
Junshen K. Chen
Jo Chuang
Marynel Vázquez
Silvio Savarese
LM&Ro
27
99
0
09 Dec 2020
EffiScene: Efficient Per-Pixel Rigidity Inference for Unsupervised Joint Learning of Optical Flow, Depth, Camera Pose and Motion Segmentation
Yang Jiao
T. Tran
Guangming Shi
32
33
0
16 Nov 2020
Reverse engineering learned optimizers reveals known and novel mechanisms
Niru Maheswaranathan
David Sussillo
Luke Metz
Ruoxi Sun
Jascha Narain Sohl-Dickstein
14
21
0
04 Nov 2020
Multi-View Adaptive Fusion Network for 3D Object Detection
Guojun Wang
Bin Tian
Yachen Zhang
Long Chen
Dongpu Cao
Jian Wu
3DPC
21
25
0
02 Nov 2020
EDCNN: Edge enhancement-based Densely Connected Network with Compound Loss for Low-Dose CT Denoising
Tengfei Liang
Yi Jin
Yidong Li
Tao Wang
Songhe Feng
Congyan Lang
8
94
0
30 Oct 2020
Scaling Laws for Autoregressive Generative Modeling
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
...
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
25
405
0
28 Oct 2020
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
Jianguo Zhang
Kazuma Hashimoto
Wenhao Liu
Chien-Sheng Wu
Yao Wan
Philip S. Yu
R. Socher
Caiming Xiong
14
92
0
25 Oct 2020
DialogueTRM: Exploring the Intra- and Inter-Modal Emotional Behaviors in the Conversation
Yuzhao Mao
Qi Sun
Guang Liu
Xiaojie Wang
Weiguo Gao
Xuan Li
Jianping Shen
19
24
0
15 Oct 2020
Extracting a Knowledge Base of Mechanisms from COVID-19 Papers
Tom Hope
Aida Amini
David Wadden
Madeleine van Zuylen
Sravanthi Parasa
Eric Horvitz
Daniel S. Weld
Roy Schwartz
Hannaneh Hajishirzi
24
29
0
08 Oct 2020
Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
Subhojeet Pramanik
Shashank Mujumdar
Hima Patel
13
31
0
30 Sep 2020
DSC IIT-ISM at SemEval-2020 Task 6: Boosting BERT with Dependencies for Definition Extraction
Aadarsh Singh
Priyanshu Kumar
Aman Sinha
14
4
0
17 Sep 2020
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data
Qianhui Wu
Zijia Lin
Börje F. Karlsson
Biqing Huang
Jian-Guang Lou
13
46
0
15 Jul 2020
Ensemble Transfer Learning for Emergency Landing Field Identification on Moderate Resource Heterogeneous Kubernetes Cluster
Andreas Klos
Marius Rosenbaum
W. Schiffmann
6
2
0
26 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
62
2,618
0
05 Jun 2020
Real-Time Apple Detection System Using Embedded Systems With Hardware Accelerators: An Edge AI Application
Vittorio Mazzia
Francesco Salvetti
Aleem Khaliq
Marcello Chiaberge
22
152
0
28 Apr 2020