arXiv:1904.09482
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
20 April 2019
Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao
Community: FedML
Links: arXiv (abs), PDF, HTML, GitHub (2,250 stars)

Papers citing "Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding"
(37 of 87 citing papers shown)
- An Overview of Neural Network Compression. James O'Neill [AI4CE]. 160 / 99 / 0. 05 Jun 2020.
- Transferring Inductive Biases through Knowledge Distillation. Samira Abnar, Mostafa Dehghani, Willem H. Zuidema. 90 / 60 / 0. 31 May 2020.
- Language Models are Few-Shot Learners. Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei [BDL]. 1.1K / 42,651 / 0. 28 May 2020.
- An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining. Yifan Peng, Qingyu Chen, Zhiyong Lu. 84 / 116 / 0. 06 May 2020.
- Distilling Spikes: Knowledge Distillation in Spiking Neural Networks. R. K. Kushawaha, S. Kumar, Biplab Banerjee, R. Velmurugan. 43 / 33 / 0. 01 May 2020.
- The Inception Team at NSURL-2019 Task 8: Semantic Question Similarity in Arabic. Hana' Al-Theiabat, Aisha Al-Sadi. 110 / 3 / 0. 24 Apr 2020.
- A Study of Non-autoregressive Model for Sequence Generation. Yi Ren, Jinglin Liu, Xu Tan, Zhou Zhao, Sheng Zhao, Tie-Yan Liu. 109 / 62 / 0. 22 Apr 2020.
- DIET: Lightweight Language Understanding for Dialogue Systems. Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol. 74 / 162 / 0. 21 Apr 2020.
- XtremeDistil: Multi-stage Distillation for Massive Multilingual Models. Subhabrata Mukherjee, Ahmed Hassan Awadallah. 80 / 59 / 0. 12 Apr 2020.
- Towards Non-task-specific Distillation of BERT via Sentence Representation Approximation. Bowen Wu, Huan Zhang, Mengyuan Li, Zongsheng Wang, Qihang Feng, Junhong Huang, Baoxun Wang. 34 / 4 / 0. 07 Apr 2020.
- Pre-trained Models for Natural Language Processing: A Survey. Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang [LM&MA, VLM]. 390 / 1,498 / 0. 18 Mar 2020.
- Style Example-Guided Text Generation using Generative Adversarial Transformers. Kuo-Hao Zeng, Mohammad Shoeybi, Ming-Yuan Liu [GAN]. 93 / 18 / 0. 02 Mar 2020.
- TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing. Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu [VLM]. 75 / 48 / 0. 28 Feb 2020.
- Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation. Yige Xu, Xipeng Qiu, L. Zhou, Xuanjing Huang. 83 / 67 / 0. 24 Feb 2020.
- The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding. Xiaodong Liu, Yu Wang, Jianshu Ji, Hao Cheng, Xueyun Zhu, ..., Pengcheng He, Weizhu Chen, Hoifung Poon, Guihong Cao, Jianfeng Gao [AI4CE]. 77 / 61 / 0. 19 Feb 2020.
- TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval. Wenhao Lu, Jian Jiao, Ruofei Zhang. 60 / 50 / 0. 14 Feb 2020.
- BERT-of-Theseus: Compressing BERT by Progressive Module Replacing. Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou. 348 / 201 / 0. 07 Feb 2020.
- Aligning the Pretraining and Finetuning Objectives of Language Models. Nuo Wang Pierse, Jing Lu [AI4CE]. 35 / 2 / 0. 05 Feb 2020.
- PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination. Saurabh Goyal, Anamitra R. Choudhury, Saurabh ManishRaje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma. 96 / 18 / 0. 24 Jan 2020.
- MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models. Linqing Liu, Haiquan Wang, Jimmy J. Lin, R. Socher, Caiming Xiong. 65 / 21 / 0. 09 Nov 2019.
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization. Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, T. Zhao. 135 / 563 / 0. 08 Nov 2019.
- Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System. Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang. 67 / 94 / 0. 18 Oct 2019.
- Knowledge Distillation from Internal Representations. Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Edward Guo. 106 / 181 / 0. 08 Oct 2019.
- Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data. Subhabrata Mukherjee, Ahmed Hassan Awadallah. 88 / 25 / 0. 04 Oct 2019.
- Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks. Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, M. Zhou. 109 / 233 / 0. 03 Sep 2019.
- A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition. L. Simeonova, K. Simov, P. Osenova, Preslav Nakov. 33 / 8 / 0. 27 Aug 2019.
- Patient Knowledge Distillation for BERT Model Compression. S. Sun, Yu Cheng, Zhe Gan, Jingjing Liu. 151 / 843 / 0. 25 Aug 2019.
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding. Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein [FedML]. 78 / 37 / 0. 14 Aug 2019.
- A Hybrid Neural Network Model for Commonsense Reasoning. Pengcheng He, Xiaodong Liu, Weizhu Chen, Jianfeng Gao [LRM]. 75 / 29 / 0. 27 Jul 2019.
- RoBERTa: A Robustly Optimized BERT Pretraining Approach. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov [AIMat]. 782 / 24,613 / 0. 26 Jul 2019.
- A Pragmatics-Centered Evaluation Framework for Natural Language Understanding. Damien Sileo, Tim Van de Cruys, Camille Pradel, Philippe Muller [ELM]. 34 / 3 / 0. 19 Jul 2019.
- Task Selection Policies for Multitask Learning. John Glover, Chris Hokamp [OffRL]. 84 / 7 / 0. 14 Jul 2019.
- BAM! Born-Again Multi-Task Networks for Natural Language Understanding. Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le. 84 / 230 / 0. 10 Jul 2019.
- DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain. Yichong Xu, Xiaodong Liu, Chunyuan Li, Hoifung Poon, Jianfeng Gao [MedIm]. 81 / 15 / 0. 11 Jun 2019.
- Towards Lossless Encoding of Sentences. Gabriele Prato, Mathieu Duchesneau, A. Chandar, Alain Tapp. 46 / 2 / 0. 04 Jun 2019.
- SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems. Alex Jinpeng Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman [ELM]. 384 / 2,328 / 0. 02 May 2019.
- Multi-Task Deep Neural Networks for Natural Language Understanding. Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao [AI4CE]. 158 / 1,273 / 0. 31 Jan 2019.