TinyBERT: Distilling BERT for Natural Language Understanding
Findings of EMNLP, 2020 · arXiv:1909.10351 (23 September 2019)
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu · VLM

Papers citing "TinyBERT: Distilling BERT for Natural Language Understanding"
Showing 50 of 1,056 citing papers.

Vision Transformer Pruning
Mingjian Zhu, Yehui Tang, Kai Han · ViT · 491 / 113 / 0 · 17 Apr 2021

Annealing Knowledge Distillation
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
A. Jafari, Mehdi Rezagholizadeh, Pranav Sharma, A. Ghodsi · 205 / 91 / 0 · 14 Apr 2021

Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2021
Sebastian Hofstätter, Sheng-Chieh Lin, Jheng-Hong Yang, Jimmy J. Lin, Allan Hanbury · VLM · 476 / 456 / 0 · 14 Apr 2021

Efficient transfer learning for NLP with ELECTRA
Omer Levy · 56 / 1 / 0 · 06 Apr 2021

Compressing Visual-linguistic Model via Knowledge Distillation
IEEE International Conference on Computer Vision (ICCV), 2021
Zhiyuan Fang, Jianfeng Wang, Xiaowei Hu, Lijuan Wang, Yezhou Yang, Zicheng Liu · VLM · 283 / 116 / 0 · 05 Apr 2021

Shrinking Bigfoot: Reducing wav2vec 2.0 footprint
Zilun Peng, Akshay Budhkar, Ilana Tuil, J. Levy, Parinaz Sobhani, Raphael Cohen, J. Nassour · 186 / 35 / 0 · 29 Mar 2021

Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
P. Jayarao, Arpit Sharma · 114 / 4 / 0 · 29 Mar 2021

A Practical Survey on Faster and Lighter Transformers
ACM Computing Surveys (CSUR), 2021
Quentin Fournier, G. Caron, Daniel Aloise · 386 / 139 / 0 · 26 Mar 2021

Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers
International Journal of Machine Learning and Cybernetics (IJMLC), 2021
Markus Bayer, M. Kaufhold, Björn Buchhold, Marcel Keller, J. Dallmeyer, Christian A. Reuter · 207 / 145 / 0 · 26 Mar 2021

The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
IEEE Access, 2021
Sushant Singh, A. Mahmood · AI4TS · 325 / 121 / 0 · 23 Mar 2021

ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques
AAAI Conference on Artificial Intelligence (AAAI), 2021
Yuanxin Liu, Zheng Lin, Fengcheng Yuan · VLM, MQ · 184 / 22 / 0 · 21 Mar 2021

NameRec*: Highly Accurate and Fine-grained Person Name Recognition
Rui Zhang, Yimeng Dai, Shijie Liu · 121 / 0 / 0 · 21 Mar 2021

Cost-effective Deployment of BERT Models in Serverless Environment
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Katarína Benešová, Andrej Švec, Marek Šuppa · 167 / 2 / 0 · 19 Mar 2021

Scalable Vision Transformers with Hierarchical Pooling
IEEE International Conference on Computer Vision (ICCV), 2021
Zizheng Pan, Bohan Zhuang, Jing Liu, Haoyu He, Jianfei Cai · ViT · 242 / 147 / 0 · 19 Mar 2021

UniParma at SemEval-2021 Task 5: Toxic Spans Detection Using CharacterBERT and Bag-of-Words Model
International Workshop on Semantic Evaluation (SemEval), 2021
Akbar Karimi, L. Rossi, Andrea Prati · 247 / 4 / 0 · 17 Mar 2021

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
International Conference on Learning Representations (ICLR), 2021
Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma · 266 / 23 / 0 · 16 Mar 2021

LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Siqi Sun, Yen-Chun Chen, Linjie Li, Shuohang Wang, Yuwei Fang, Jingjing Liu · VLM · 211 / 89 / 0 · 16 Mar 2021

TAG: Gradient Attack on Transformer-based Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jieren Deng, Yijue Wang, Ji Li, Chao Shang, Hang Liu, Sanguthevar Rajasekaran, Caiwen Ding · FedML, PILM · 247 / 95 / 0 · 11 Mar 2021

LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu · 114 / 10 / 0 · 11 Mar 2021

Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
International Journal of Computer Vision (IJCV), 2021
Andrew Shin, Masato Ishii, T. Narihira · 310 / 51 / 0 · 06 Mar 2021

Unbiased Sentence Encoder For Large-Scale Multi-lingual Search Engines
Mahdi Hajiaghayi, Monir Hajiaghayi, Mark R. Bolin · 122 / 0 / 0 · 01 Mar 2021

Learning Dynamic BERT via Trainable Gate Variables and a Bi-modal Regularizer
Seohyeong Jeong, Nojun Kwak · 84 / 0 / 0 · 19 Feb 2021

Centroid Transformers: Learning to Abstract with Attention
Lemeng Wu, Xingchao Liu, Qiang Liu · 3DPC · 253 / 34 / 0 · 17 Feb 2021

Improved Customer Transaction Classification using Semi-Supervised Knowledge Distillation
Rohan Sukumaran · 88 / 2 / 0 · 15 Feb 2021

Learning Student-Friendly Teacher Networks for Knowledge Distillation
Neural Information Processing Systems (NeurIPS), 2021
D. Park, Moonsu Cha, C. Jeong, Daesin Kim, Bohyung Han · 524 / 117 / 0 · 12 Feb 2021

NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Chuhan Wu, Fangzhao Wu, Yang Yu, Tao Qi, Yongfeng Huang, Qi Liu · VLM · 187 / 48 / 0 · 09 Feb 2021

FedAUX: Leveraging Unlabeled Auxiliary Data in Federated Learning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Felix Sattler, Tim Korjakow, R. Rischke, Wojciech Samek · FedML · 189 / 142 / 0 · 04 Feb 2021

AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu, Saurabh Agarwal, Shivaram Venkataraman · OffRL · 247 / 72 / 0 · 02 Feb 2021

Distilling Large Language Models into Tiny and Effective Students using pQRNN
P. Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson · MQ · 139 / 18 / 0 · 21 Jan 2021

Deep Epidemiological Modeling by Black-box Knowledge Distillation: An Accurate Deep Learning Model for COVID-19
AAAI Conference on Artificial Intelligence (AAAI), 2021
Dongdong Wang, Shunpu Zhang, Liqiang Wang · 141 / 15 / 0 · 20 Jan 2021

Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation
AAAI Conference on Artificial Intelligence (AAAI), 2021
Lingyun Feng, Minghui Qiu, Yaliang Li, Haitao Zheng, Ying Shen · 182 / 12 / 0 · 20 Jan 2021

Model Compression for Domain Adaptation through Causal Effect Estimation
Transactions of the Association for Computational Linguistics (TACL), 2021
Guy Rotman, Amir Feder, Roi Reichart · CML · 267 / 8 / 0 · 18 Jan 2021

KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin, Cai Liang, Tiancheng Wu, Li Zou, Zhiliang Gan · MQ · 194 / 28 / 0 · 15 Jan 2021

SEED: Self-supervised Distillation For Visual Representation
International Conference on Learning Representations (ICLR), 2021
Zhiyuan Fang, Jianfeng Wang, Lijuan Wang, Lei Zhang, Yezhou Yang, Zicheng Liu · SSL · 513 / 209 / 0 · 12 Jan 2021

Adversarially Robust and Explainable Model Compression with On-Device Personalization for Text Classification
Yao Qiang, Supriya Tumkur Suresh Kumar, Marco Brocanelli, D. Zhu · AAML · 144 / 0 / 0 · 10 Jan 2021

Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed
Eric Luhman, Troy Luhman · DiffM · 497 / 349 / 0 · 07 Jan 2021

MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Woojeong Jin, Maziar Sanjabi, Shaoliang Nie, L. Tan, Xiang Ren, Hamed Firooz · 165 / 6 / 0 · 06 Jan 2021

I-BERT: Integer-only BERT Quantization
International Conference on Machine Learning (ICML), 2021
Sehoon Kim, A. Gholami, Z. Yao, Michael W. Mahoney, Kurt Keutzer · MQ · 476 / 370 / 0 · 05 Jan 2021

WARP: Word-level Adversarial ReProgramming
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Karen Hambardzumyan, Hrant Khachatrian, Jonathan May · AAML · 679 / 369 / 0 · 01 Jan 2021

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zinan Lin, Jingjing Liu · 433 / 104 / 0 · 31 Dec 2020

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Findings, 2020
Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei · MQ · 434 / 349 / 0 · 31 Dec 2020

BinaryBERT: Pushing the Limit of BERT Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King · MQ · 499 / 251 / 0 · 31 Dec 2020

Towards Zero-Shot Knowledge Distillation for Natural Language Processing
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ahmad Rashid, Vasileios Lioutas, Abbas Ghaddar, Mehdi Rezagholizadeh · 255 / 32 / 0 · 31 Dec 2020

Unified Mandarin TTS Front-end Based on Distilled BERT Model
Yang Zhang, Liqun Deng, Yasheng Wang · 167 / 26 / 0 · 31 Dec 2020

SemGloVe: Semantic Co-occurrences for GloVe from BERT
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2020
Yaoyao Yu, Zhiyang Teng, Yue Zhang, Linchao Zhu, Leilei Gan, Yi Yang · 182 / 21 / 0 · 30 Dec 2020

CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun · 246 / 59 / 0 · 29 Dec 2020

ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
AAAI Conference on Artificial Intelligence (AAAI), 2020
Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu · 162 / 134 / 0 · 27 Dec 2020

Learning Light-Weight Translation Models from Deep Transformer
AAAI Conference on Artificial Intelligence (AAAI), 2020
Bei Li, Ziyang Wang, Hui Liu, Quan Du, Tong Xiao, Chunliang Zhang, Jingbo Zhu · VLM · 298 / 43 / 0 · 27 Dec 2020

Towards a Universal Continuous Knowledge Base
AI Open (AO), 2020
Gang Chen, Maosong Sun, Yang Liu · 239 / 3 / 0 · 25 Dec 2020

A Survey on Visual Transformer
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020
Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, ..., Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, Dacheng Tao · ViT · 1.1K / 3,160 / 0 · 23 Dec 2020

Page 19 of 22