Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
arXiv:1904.09482 · 20 April 2019
Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao · FedML
Links: arXiv (abs) · PDF · HTML · GitHub (2,250★)
Papers citing "Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding" (50 of 87 shown)
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data
Anup Shirgaonkar, Nikhil Pandey, Nazmiye Ceren Abay, Tolga Aktas, Vijay Aski · ALM, SyDa · 24 Oct 2024

Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
Baiyu Pan, Jichao Jiao, Jianxin Pang, Jun Cheng · 20 May 2024

Distilling Named Entity Recognition Models for Endangered Species from Large Language Models
Jesse Atuhurra, Seiveright Cargill Dujohn, Hidetaka Kamigaito, Hiroyuki Shindo, Taro Watanabe · 13 Mar 2024

Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar, Kanchan Chandra · 21 Feb 2024

Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection
Tao He, Lianli Gao, Jingkuan Song, Yuan-Fang Li · ViT · 03 Nov 2023

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao · MQ, VLM · 20 Oct 2023

SPICED: News Similarity Detection Dataset with Multiple Topics and Complexity Levels
Elena Shushkevich, Long Mai, Manuel V. Loureiro, Steven Derby, Tri Kurniawan Wijaya · AI4TS · 21 Sep 2023

The Impact of Artificial Intelligence on the Evolution of Digital Education: A Comparative Study of OpenAI Text Generation Tools including ChatGPT, Bing Chat, Bard, and Ernie
Negin Yazdani Motlagh, Matin Khajavi, Abbas Sharifi, Mohsen Ahmadi · 05 Sep 2023

Multi-Objective Optimization for Sparse Deep Multi-Task Learning
S. S. Hotegni, M. Berkemeier, S. Peitz · 23 Aug 2023

Shared Growth of Graph Neural Networks via Prompted Free-direction Knowledge Distillation
Kaituo Feng, Yikun Miao, Changsheng Li, Ye Yuan, Guoren Wang · 02 Jul 2023

FLamE: Few-shot Learning from Natural Language Explanations
Yangqiaoyu Zhou, Yiming Zhang, Chenhao Tan · LRM, FAtt · 13 Jun 2023

minOffense: Inter-Agreement Hate Terms for Stable Rules, Concepts, Transitivities, and Lattices
Animesh Chaturvedi, Rajesh Sharma · 29 May 2023

A Comparison of Document Similarity Algorithms
Nicholas Gahman, V. Elangovan · AI4TS · 03 Apr 2023

Graph-based Knowledge Distillation: A survey and experimental evaluation
Jing Liu, Tongya Zheng, Guanzheng Zhang, Qinfen Hao · 27 Feb 2023

Protein Language Models and Structure Prediction: Connection and Progression
Bozhen Hu, Jun Xia, Jiangbin Zheng, Cheng Tan, Yufei Huang, Yongjie Xu, Stan Z. Li · 30 Nov 2022

Learning by Distilling Context
Charles Burton Snell, Dan Klein, Ruiqi Zhong · ReLM, LRM · 30 Sep 2022

FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks
Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang · 14 Jun 2022

When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller, Kevin Seppi, Matt Gardner · 17 May 2022

A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh, A. Azarpeyvand, Alireza Khanteymoori · MQ · 14 May 2022

Unified and Effective Ensemble Knowledge Distillation
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang · FedML · 01 Apr 2022

Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention
Zuzana Jelčicová, Marian Verhelst · 20 Mar 2022

Survey on Automated Short Answer Grading with Deep Learning: from Word Embeddings to Transformers
Stefan Haller, Adina Aldea, C. Seifert, N. Strisciuglio · 11 Mar 2022

On Steering Multi-Annotations per Sample for Multi-Task Learning
Yuan Li, Yiwen Guo, Qizhang Li, Hongzhi Zhang, W. Zuo · 06 Mar 2022

Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar, Enamul Hoque, J. Huang · 22 Dec 2021

Leveraging Sentiment Analysis Knowledge to Solve Emotion Detection Tasks
Maude Nguyen-The, Guillaume-Alexandre Bilodeau, Jan Rockemann · 05 Nov 2021

BERT-DRE: BERT with Deep Recursive Encoder for Natural Language Sentence Matching
Ehsan Tavan, A. Rahmati, M. Najafi, Saeed Bibak, Zahed Rahmati · 03 Nov 2021

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression
Chenhe Dong, Yaliang Li, Ying Shen, Minghui Qiu · VLM · 16 Oct 2021

Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian · 26 Sep 2021

Multi-Task Self-Training for Learning General Representations
Golnaz Ghiasi, Barret Zoph, E. D. Cubuk, Quoc V. Le, Nayeon Lee · SSL · 25 Aug 2021
Explaining Bayesian Neural Networks
Kirill Bykov, Marina M.-C. Höhne, Adelaida Creosteanu, Klaus-Robert Müller, Frederick Klauschen, Shinichi Nakajima, Marius Kloft · BDL, AAML · 23 Aug 2021
A distillation based approach for the diagnosis of diseases
Hmrishav Bandyopadhyay, Shuvayan Ghosh Dastidar, Bisakh Mondal, Biplab Banerjee, N. Das · 07 Aug 2021

XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee, Ahmed Hassan Awadallah, Jianfeng Gao · 08 Jun 2021

Multi-task Graph Convolutional Neural Network for Calcification Morphology and Distribution Analysis in Mammograms
Hao Du, Melissa Min-Szu Yao, Liangyu Chen, Wing P. Chan, Mengling Feng · MedIm · 14 May 2021

Dual-View Distilled BERT for Sentence Embedding
Xingyi Cheng · 3DV · 18 Apr 2021

Industry Scale Semi-Supervised Learning for Natural Language Understanding
Luoxin Chen, Francisco Garcia, Varun Kumar, He Xie, Jianhua Lu · 29 Mar 2021

ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu · 27 Dec 2020

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li · FedML · 17 Dec 2020

Medical Knowledge-enriched Textual Entailment Framework
S. Yadav, Vishal Pallagani, A. Sheth · AI4MH · 10 Nov 2020

Improved Synthetic Training for Reading Comprehension
Yanda Chen, Md Arafat Sultan, T. J. W. R. Center · SyDa · 24 Oct 2020

Is Retriever Merely an Approximator of Reader?
Sohee Yang, Minjoon Seo · RALM · 21 Oct 2020

BERT2DNN: BERT Distillation with Massive Unlabeled Data for Online E-Commerce Search
Yunjiang Jiang, Yue Shang, Ziyang Liu, Hongwei Shen, Yun Xiao, Wei Xiong, Sulong Xu, Weipeng P. Yan, Di Jin · 20 Oct 2020

Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Shuohuan Wang, Jiaxiang Liu, Ouyang Xuan, Yu Sun · 07 Oct 2020

Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw · CVBM · 10 Sep 2020

DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild
Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Vincent Leroy, Grégory Rogez · 3DH · 21 Aug 2020

Small Towers Make Big Differences
Yuyan Wang, Zhe Zhao, Bo Dai, Christopher Fifty, Dong Lin, Lichan Hong, Ed H. Chi · 13 Aug 2020

Compression of Deep Learning Models for Text: A Survey
Manish Gupta, Puneet Agrawal · VLM, MedIm, AI4CE · 12 Aug 2020
CORD19STS: COVID-19 Semantic Textual Similarity Dataset
Xiao Guo, H. Mirzaalian, Ekraam Sabir, Ayush Jaiswal, Wael AbdAlmageed · 05 Jul 2020
Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution
Hadi Pouransari, Mojan Javaheripi, Vinay Sharma, Oncel Tuzel · 30 Jun 2020

CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning
Hongru Wang, Xiangru Tang, Sunny Lai, Kwong Sak Leung, Jia Zhu, Gabriel Pui Cheong Fung, K. Leung · ReLM, LRM · 12 Jun 2020

Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao · VLM · 09 Jun 2020