2008.02496
ConvBERT: Improving BERT with Span-based Dynamic Convolution
6 August 2020
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan
Papers citing
"ConvBERT: Improving BERT with Span-based Dynamic Convolution"
26 / 76 papers shown
Title
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
16
28
0
15 Sep 2021
Explainable Identification of Dementia from Transcripts using Transformer Networks
Loukas Ilias
D. Askounis
10
38
0
14 Sep 2021
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
21
2
0
30 Aug 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
20
258
0
12 Aug 2021
AutoBERT-Zero: Evolving BERT Backbone from Scratch
Jiahui Gao
Hang Xu
Han Shi
Xiaozhe Ren
Philip L. H. Yu
Xiaodan Liang
Xin Jiang
Zhenguo Li
11
37
0
15 Jul 2021
DaCy: A Unified Framework for Danish NLP
K. Enevoldsen
Lasse Hansen
Kristoffer Laigaard Nielbo
27
13
0
12 Jul 2021
LV-BERT: Exploiting Layer Variety for BERT
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
10
0
0
22 Jun 2021
A Comprehensive Comparison of Pre-training Language Models
Tonglei Guo
VLM
ELM
19
3
0
22 Jun 2021
Can Transformer Language Models Predict Psychometric Properties?
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
LM&MA
30
14
0
12 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
11
5
0
10 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Z. Tu
ViT
13
15
0
10 Jun 2021
Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen
Jialu Liu
Tianqi Liu
Cong Yu
Jiawei Han
21
9
0
31 May 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
18
163
0
16 Apr 2021
Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation
Wenhao Li
Hong Liu
Runwei Ding
Mengyuan Liu
Pichao Wang
Wenming Yang
ViT
15
189
0
26 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
19
510
0
22 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
25
210
0
11 Mar 2021
M6: A Chinese Multimodal Pretrainer
Junyang Lin
Rui Men
An Yang
Chan Zhou
Ming Ding
...
Yong Li
Wei Lin
Jingren Zhou
J. Tang
Hongxia Yang
VLM
MoE
23
132
0
01 Mar 2021
TransReID: Transformer-based Object Re-Identification
Shuting He
Haowen Luo
Pichao Wang
F. Wang
Hao Li
Wei Jiang
ViT
213
793
0
08 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
16
2,123
0
23 Dec 2020
Rethinking Transformer-based Set Prediction for Object Detection
Zhiqing Sun
Shengcao Cao
Yiming Yang
Kris M. Kitani
ViT
13
319
0
21 Nov 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
241
1,450
0
18 Mar 2020
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
229
778
0
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019
Knowledge Enhanced Contextual Word Representations
Matthew E. Peters
Mark Neumann
Robert L. Logan IV
Roy Schwartz
Vidur Joshi
Sameer Singh
Noah A. Smith
224
656
0
09 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
250
13,347
0
25 Aug 2014