TinyBERT: Distilling BERT for Natural Language Understanding (arXiv:1909.10351)
Findings of EMNLP, 2020
23 September 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
Papers citing "TinyBERT: Distilling BERT for Natural Language Understanding" (50 of 1,056 papers shown)
Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal
Findings, 2022
Umang Gupta
Jwala Dhamala
Varun Kumar
Apurv Verma
Yada Pruksachatkun
Satyapriya Krishna
Rahul Gupta
Kai-Wei Chang
Greg Ver Steeg
Aram Galstyan
174
61
0
23 Mar 2022
Input-specific Attention Subnetworks for Adversarial Detection
Findings, 2022
Emil Biju
Anirudh Sriram
Pratyush Kumar
Mitesh M Khapra
AAML
162
5
0
23 Mar 2022
Text Transformations in Contrastive Self-Supervised Learning: A Review
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Amrita Bhattacharjee
Mansooreh Karami
Huan Liu
SSL
381
23
0
22 Mar 2022
Out-of-distribution Generalization with Causal Invariant Transformations
Computer Vision and Pattern Recognition (CVPR), 2022
Ruoyu Wang
Mingyang Yi
Zhitang Chen
Shengyu Zhu
OOD
OODD
256
80
0
22 Mar 2022
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zheng Li
Zijian Wang
Ming Tan
Ramesh Nallapati
Parminder Bhatia
Andrew O. Arnold
Bing Xiang
Dan Roth
MQ
171
46
0
21 Mar 2022
Compression of Generative Pre-trained Language Models via Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Chaofan Tao
Lu Hou
Wei Zhang
Lifeng Shang
Xin Jiang
Qun Liu
Ping Luo
Ngai Wong
MQ
262
116
0
21 Mar 2022
When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation
Findings, 2022
Ehsan Kamalloo
Mehdi Rezagholizadeh
A. Ghodsi
218
11
0
17 Mar 2022
Compressing Sentence Representation for Semantic Retrieval via Homomorphic Projective Distillation
Findings, 2022
Xuandong Zhao
Zhiguo Yu
Ming-li Wu
Lei Li
113
8
0
15 Mar 2022
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Eldar Kurtic
Daniel Fernando Campos
Tuan Nguyen
Elias Frantar
Mark Kurtz
Ben Fineran
Michael Goin
Dan Alistarh
VLM
MQ
MedIm
395
146
0
14 Mar 2022
BiBERT: Accurate Fully Binarized BERT
International Conference on Learning Representations (ICLR), 2022
Haotong Qin
Yifu Ding
Mingyuan Zhang
Qing Yan
Aishan Liu
Qingqing Dang
Ziwei Liu
Xianglong Liu
MQ
195
113
0
12 Mar 2022
Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Findings, 2022
Wenliang Dai
Lu Hou
Lifeng Shang
Xin Jiang
Qun Liu
Pascale Fung
VLM
235
107
0
12 Mar 2022
LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval
Jie Lei
Xinlei Chen
Ning Zhang
Meng-xing Wang
Joey Tianyi Zhou
Tamara L. Berg
Licheng Yu
267
15
0
10 Mar 2022
Knowledge Amalgamation for Object Detection with Transformers
IEEE Transactions on Image Processing (IEEE TIP), 2022
Haofei Zhang
Feng Mao
Mengqi Xue
Gongfan Fang
Zunlei Feng
Mingli Song
Weilong Dai
ViT
385
16
0
07 Mar 2022
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Findings, 2022
Tianxiang Sun
Xiangyang Liu
Wei-wei Zhu
Zhichao Geng
Lingling Wu
Yilong He
Yuan Ni
Guotong Xie
Xuanjing Huang
Xipeng Qiu
254
42
0
03 Mar 2022
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mohammad Akbari
Amin Banitalebi-Dehkordi
Yong Zhang
181
9
0
01 Mar 2022
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
R. Liu
Kailun Yang
Alina Roitberg
Kailai Li
Kunyu Peng
Huayao Liu
Yaonan Wang
Rainer Stiefelhagen
ViT
276
58
0
27 Feb 2022
Art Creation with Multi-Conditional StyleGANs
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Konstantin Dobler
Florian Hübscher
Jan Westphal
Alejandro Sierra-Múnera
Gerard de Melo
Ralf Krestel
GAN
AI4CE
267
8
0
23 Feb 2022
LAMP: Extracting Text from Gradients with Language Model Priors
Neural Information Processing Systems (NeurIPS), 2022
Mislav Balunović
Dimitar I. Dimitrov
Nikola Jovanović
Martin Vechev
318
78
0
17 Feb 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
351
276
0
16 Feb 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2022
Canwen Xu
Julian McAuley
359
87
0
15 Feb 2022
What is Next when Sequential Prediction Meets Implicitly Hard Interaction?
International Conference on Information and Knowledge Management (CIKM), 2021
Kaixi Hu
Lin Li
Qing Xie
Jianquan Liu
Xiaohui Tao
171
22
0
14 Feb 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Francesco Fusco
Damian Pascual
Peter W. J. Staar
Diego Antognini
209
34
0
09 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
584
1,037
0
07 Feb 2022
Aspect-based Sentiment Analysis through EDU-level Attentions
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2022
Ting Lin
Aixin Sun
Yequan Wang
150
7
0
05 Feb 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
205
5
0
29 Jan 2022
Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Haoyu Dong
Zhoujun Cheng
Xinyi He
Mengyuan Zhou
Anda Zhou
Fan Zhou
Ao Liu
Shi Han
Dongmei Zhang
LMTD
427
74
0
24 Jan 2022
Can Model Compression Improve NLP Fairness
Guangxuan Xu
Qingyuan Hu
146
30
0
21 Jan 2022
AutoDistill: an End-to-End Framework to Explore and Distill Hardware-Efficient Language Models
Xiaofan Zhang
Zongwei Zhou
Deming Chen
Yu Emma Wang
173
12
0
21 Jan 2022
VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer
Mengshu Sun
Haoyu Ma
Guoliang Kang
Lezhi Li
Tianlong Chen
Xiaolong Ma
Zinan Lin
Yanzhi Wang
ViT
283
54
0
17 Jan 2022
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Yoshitomo Matsubara
Luca Soldaini
Eric Lind
Alessandro Moschitti
235
7
0
15 Jan 2022
CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks
Zhecan Wang
Noel Codella
Yen-Chun Chen
Luowei Zhou
Jianwei Yang
Xiyang Dai
Bin Xiao
Haoxuan You
Shih-Fu Chang
Lu Yuan
CLIP
VLM
213
44
0
15 Jan 2022
Pretrained Language Models for Text Generation: A Survey
ACM Computing Surveys (ACM CSUR), 2022
Junyi Li
Tianyi Tang
Wayne Xin Zhao
J. Nie
Ji-Rong Wen
AI4CE
535
268
0
14 Jan 2022
Latency Adjustable Transformer Encoder for Language Understanding
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Sajjad Kachuee
M. Sharifkhani
590
1
0
10 Jan 2022
ThreshNet: An Efficient DenseNet Using Threshold Mechanism to Reduce Connections
IEEE Access (IEEE Access), 2022
Ruikang Ju
Ting-Yu Lin
Jia-Hao Jian
Jen-Shiun Chiang
Weida Yang
260
9
0
09 Jan 2022
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
AAAI Conference on Artificial Intelligence (AAAI), 2022
L. Ein-Dor
Ilya Shnayderman
Artem Spector
Lena Dankin
R. Aharonov
Noam Slonim
213
9
0
06 Jan 2022
Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models
Made Nindyatama Nityasya
Haryo Akbarianto Wibowo
Rendi Chevi
Radityo Eko Prasojo
Alham Fikri Aji
181
7
0
03 Jan 2022
Automatic Mixed-Precision Quantization Search of BERT
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Changsheng Zhao
Ting Hua
Yilin Shen
Qian Lou
Hongxia Jin
MQ
171
26
0
30 Dec 2021
An Efficient Combinatorial Optimization Model Using Learning-to-Rank Distillation
AAAI Conference on Artificial Intelligence (AAAI), 2021
Honguk Woo
Hyunsung Lee
Sangwook Cho
261
7
0
24 Dec 2021
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
...
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
ELM
214
87
0
23 Dec 2021
Distilling the Knowledge of Romanian BERTs Using Multiple Teachers
International Conference on Language Resources and Evaluation (LREC), 2021
Andrei-Marius Avram
Darius Catrina
Dumitru-Clementin Cercel
Mihai Dascălu
Traian Rebedea
Vasile Păiș
Dan Tufiș
343
14
0
23 Dec 2021
Sublinear Time Approximation of Text Similarity Matrices
AAAI Conference on Artificial Intelligence (AAAI), 2021
Archan Ray
Nicholas Monath
Andrew McCallum
Cameron Musco
303
7
0
17 Dec 2021
Data Efficient Language-supervised Zero-shot Recognition with Optimal Transport Distillation
Bichen Wu
Rui Cheng
Peizhao Zhang
Tianren Gao
Peter Vajda
Joseph E. Gonzalez
VLM
322
54
0
17 Dec 2021
Distilled Dual-Encoder Model for Vision-Language Understanding
Zekun Wang
Wenhui Wang
Haichao Zhu
Ming Liu
Bing Qin
Furu Wei
VLM
FedML
214
35
0
16 Dec 2021
AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin
Arash Vahdat
J. Álvarez
Arun Mallya
Jan Kautz
Pavlo Molchanov
ViT
647
449
0
14 Dec 2021
LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework
Mengjie Zhao
Fei Mi
Yasheng Wang
Minglei Li
Xin Jiang
Qun Liu
Hinrich Schütze
RALM
283
12
0
14 Dec 2021
Model Uncertainty-Aware Knowledge Amalgamation for Pre-Trained Language Models
Lei Li
Yankai Lin
Xuancheng Ren
Guangxiang Zhao
Peng Li
Jie Zhou
Xu Sun
MoMe
143
2
0
14 Dec 2021
From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression
Runxin Xu
Fuli Luo
Chengyu Wang
Baobao Chang
Yanjie Liang
Songfang Huang
Fei Huang
VLM
119
31
0
14 Dec 2021
On the Compression of Natural Language Models
S. Damadi
92
0
0
13 Dec 2021
Pruning Pretrained Encoders with a Multitask Objective
Patrick Xia
Richard Shin
132
0
0
10 Dec 2021
DistilCSE: Effective Knowledge Distillation For Contrastive Sentence Embeddings
Chaochen Gao
Xing Wu
Peng Wang
Jue Wang
Liangjun Zang
Zhongyuan Wang
Songlin Hu
174
5
0
10 Dec 2021
Page 15 of 22