ResearchTrend.AI
ConvBERT: Improving BERT with Span-based Dynamic Convolution

6 August 2020
Zihang Jiang
Weihao Yu
Daquan Zhou
Yunpeng Chen
Jiashi Feng
Shuicheng Yan

Papers citing "ConvBERT: Improving BERT with Span-based Dynamic Convolution"

26 / 76 papers shown
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
16
28
0
15 Sep 2021
Explainable Identification of Dementia from Transcripts using Transformer Networks
Loukas Ilias
D. Askounis
10
38
0
14 Sep 2021
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
21
2
0
30 Aug 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
20
258
0
12 Aug 2021
AutoBERT-Zero: Evolving BERT Backbone from Scratch
Jiahui Gao
Hang Xu
Han Shi
Xiaozhe Ren
Philip L. H. Yu
Xiaodan Liang
Xin Jiang
Zhenguo Li
11
37
0
15 Jul 2021
DaCy: A Unified Framework for Danish NLP
K. Enevoldsen
Lasse Hansen
Kristoffer Laigaard Nielbo
27
13
0
12 Jul 2021
LV-BERT: Exploiting Layer Variety for BERT
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
10
0
0
22 Jun 2021
A Comprehensive Comparison of Pre-training Language Models
Tonglei Guo
VLM
ELM
19
3
0
22 Jun 2021
Can Transformer Language Models Predict Psychometric Properties?
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
LM&MA
30
14
0
12 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
11
5
0
10 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Tyler A. Chang
Yifan Xu
Weijian Xu
Z. Tu
ViT
13
15
0
10 Jun 2021
Training ELECTRA Augmented with Multi-word Selection
Jiaming Shen
Jialu Liu
Tianqi Liu
Cong Yu
Jiawei Han
21
9
0
31 May 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
18
163
0
16 Apr 2021
Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation
Wenhao Li
Hong Liu
Runwei Ding
Mengyuan Liu
Pichao Wang
Wenming Yang
ViT
15
189
0
26 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
19
510
0
22 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
25
210
0
11 Mar 2021
M6: A Chinese Multimodal Pretrainer
Junyang Lin
Rui Men
An Yang
Chan Zhou
Ming Ding
...
Yong Li
Wei Lin
Jingren Zhou
J. Tang
Hongxia Yang
VLM
MoE
23
132
0
01 Mar 2021
TransReID: Transformer-based Object Re-Identification
Shuting He
Haowen Luo
Pichao Wang
F. Wang
Hao Li
Wei Jiang
ViT
213
793
0
08 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
16
2,123
0
23 Dec 2020
Rethinking Transformer-based Set Prediction for Object Detection
Zhiqing Sun
Shengcao Cao
Yiming Yang
Kris M. Kitani
ViT
13
319
0
21 Nov 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
241
1,450
0
18 Mar 2020
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
229
778
0
17 Sep 2019
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019
Knowledge Enhanced Contextual Word Representations
Matthew E. Peters
Mark Neumann
Robert L. Logan IV
Roy Schwartz
Vidur Joshi
Sameer Singh
Noah A. Smith
224
656
0
09 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
250
13,347
0
25 Aug 2014