Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

20 April 2019
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
    FedML
ArXiv (abs) | PDF | HTML | GitHub (2250★)

Papers citing "Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding"

50 / 87 papers shown
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data
Anup Shirgaonkar
Nikhil Pandey
Nazmiye Ceren Abay
Tolga Aktas
Vijay Aski
ALM, SyDa
63
1
0
24 Oct 2024
Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
Baiyu Pan
Jichao Jiao
Jianxin Pang
Jun Cheng
72
3
0
20 May 2024
Distilling Named Entity Recognition Models for Endangered Species from Large Language Models
Jesse Atuhurra
Seiveright Cargill Dujohn
Hidetaka Kamigaito
Hiroyuki Shindo
Taro Watanabe
43
2
0
13 Mar 2024
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar
Kanchan Chandra
139
1
0
21 Feb 2024
Towards a Unified Transformer-based Framework for Scene Graph Generation and Human-object Interaction Detection
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
ViT
98
11
0
03 Nov 2023
Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models
Miaoxi Zhu
Qihuang Zhong
Li Shen
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
MQ, VLM
66
1
0
20 Oct 2023
SPICED: News Similarity Detection Dataset with Multiple Topics and Complexity Levels
Elena Shushkevich
Long Mai
Manuel V. Loureiro
Steven Derby
Tri Kurniawan Wijaya
AI4TS
76
0
0
21 Sep 2023
The Impact of Artificial Intelligence on the Evolution of Digital Education: A Comparative Study of OpenAI Text Generation Tools including ChatGPT, Bing Chat, Bard, and Ernie
Negin Yazdani Motlagh
Matin Khajavi
Abbas Sharifi
Mohsen Ahmadi
79
33
0
05 Sep 2023
Multi-Objective Optimization for Sparse Deep Multi-Task Learning
S. S. Hotegni
M. Berkemeier
S. Peitz
45
6
0
23 Aug 2023
Shared Growth of Graph Neural Networks via Prompted Free-direction Knowledge Distillation
Kaituo Feng
Yikun Miao
Changsheng Li
Ye Yuan
Guoren Wang
121
0
0
02 Jul 2023
FLamE: Few-shot Learning from Natural Language Explanations
Yangqiaoyu Zhou
Yiming Zhang
Chenhao Tan
LRM, FAtt
95
11
0
13 Jun 2023
minOffense: Inter-Agreement Hate Terms for Stable Rules, Concepts, Transitivities, and Lattices
Animesh Chaturvedi
Rajesh Sharma
69
6
0
29 May 2023
A Comparison of Document Similarity Algorithms
Nicholas Gahman
V. Elangovan
AI4TS
45
4
0
03 Apr 2023
Graph-based Knowledge Distillation: A survey and experimental evaluation
Jing Liu
Tongya Zheng
Guanzheng Zhang
Qinfen Hao
65
8
0
27 Feb 2023
Protein Language Models and Structure Prediction: Connection and Progression
Bozhen Hu
Jun Xia
Jiangbin Zheng
Cheng Tan
Yufei Huang
Yongjie Xu
Stan Z. Li
70
41
0
30 Nov 2022
Learning by Distilling Context
Charles Burton Snell
Dan Klein
Ruiqi Zhong
ReLM, LRM
233
48
0
30 Sep 2022
FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks
Kaituo Feng
Changsheng Li
Ye Yuan
Guoren Wang
105
35
0
14 Jun 2022
When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning
Orion Weller
Kevin Seppi
Matt Gardner
57
23
0
17 May 2022
A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh
A. Azarpeyvand
Alireza Khanteymoori
MQ
127
103
0
14 May 2022
Unified and Effective Ensemble Knowledge Distillation
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
FedML
56
11
0
01 Apr 2022
Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention
Zuzana Jelčicová
Marian Verhelst
92
5
0
20 Mar 2022
Survey on Automated Short Answer Grading with Deep Learning: from Word Embeddings to Transformers
Stefan Haller
Adina Aldea
C. Seifert
N. Strisciuglio
67
39
0
11 Mar 2022
On Steering Multi-Annotations per Sample for Multi-Task Learning
Yuan Li
Yiwen Guo
Qizhang Li
Hongzhi Zhang
W. Zuo
56
0
0
06 Mar 2022
Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization
Md Tahmid Rahman Laskar
Enamul Hoque
J. Huang
95
45
0
22 Dec 2021
Leveraging Sentiment Analysis Knowledge to Solve Emotion Detection Tasks
Maude Nguyen-The
Guillaume-Alexandre Bilodeau
Jan Rockemann
59
4
0
05 Nov 2021
BERT-DRE: BERT with Deep Recursive Encoder for Natural Language Sentence Matching
Ehsan Tavan
A. Rahmati
M. Najafi
Saeed Bibak
Zahed Rahmati
78
5
0
03 Nov 2021
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression
Chenhe Dong
Yaliang Li
Ying Shen
Minghui Qiu
VLM
99
7
0
16 Oct 2021
Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban
Seyed Morteza Mirbostani
Gholamreza Ghassem-Sani
Seyed Abolghasem Mirroshandel
Shahin Amiriparian
83
16
0
26 Sep 2021
Multi-Task Self-Training for Learning General Representations
Golnaz Ghiasi
Barret Zoph
E. D. Cubuk
Quoc V. Le
Nayeon Lee
SSL
91
101
0
25 Aug 2021
Explaining Bayesian Neural Networks
Kirill Bykov
Marina M.-C. Höhne
Adelaida Creosteanu
Klaus-Robert Muller
Frederick Klauschen
Shinichi Nakajima
Marius Kloft
BDL, AAML
72
25
0
23 Aug 2021
A distillation based approach for the diagnosis of diseases
Hmrishav Bandyopadhyay
Shuvayan Ghosh Dastidar
Bisakh Mondal
Biplab Banerjee
N. Das
51
1
0
07 Aug 2021
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Jianfeng Gao
57
22
0
08 Jun 2021
Multi-task Graph Convolutional Neural Network for Calcification Morphology and Distribution Analysis in Mammograms
Hao Du
Melissa Min-Szu Yao
Liangyu Chen
Wing P. Chan
Mengling Feng
MedIm
37
1
0
14 May 2021
Dual-View Distilled BERT for Sentence Embedding
Xingyi Cheng
3DV
52
14
0
18 Apr 2021
Industry Scale Semi-Supervised Learning for Natural Language Understanding
Luoxin Chen
Francisco Garcia
Varun Kumar
He Xie
Jianhua Lu
38
7
0
29 Mar 2021
ALP-KD: Attention-Based Layer Projection for Knowledge Distillation
Peyman Passban
Yimeng Wu
Mehdi Rezagholizadeh
Qun Liu
87
123
0
27 Dec 2020
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
FedML
187
376
0
17 Dec 2020
Medical Knowledge-enriched Textual Entailment Framework
S. Yadav
Vishal Pallagani
A. Sheth
AI4MH
22
8
0
10 Nov 2020
Improved Synthetic Training for Reading Comprehension
Yanda Chen
Md Arafat Sultan
T. J. W. R. Center
SyDa
67
5
0
24 Oct 2020
Is Retriever Merely an Approximator of Reader?
Sohee Yang
Minjoon Seo
RALM
85
42
0
21 Oct 2020
BERT2DNN: BERT Distillation with Massive Unlabeled Data for Online E-Commerce Search
Yunjiang Jiang
Yue Shang
Ziyang Liu
Hongwei Shen
Yun Xiao
Wei Xiong
Sulong Xu
Weipeng P. Yan
Di Jin
62
17
0
20 Oct 2020
Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Shuohuan Wang
Jiaxiang Liu
Ouyang Xuan
Yu Sun
68
36
0
07 Oct 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
217
626
0
10 Sep 2020
DOPE: Distillation Of Part Experts for whole-body 3D pose estimation in the wild
Philippe Weinzaepfel
Romain Brégier
Hadrien Combaluzier
Vincent Leroy
Grégory Rogez
3DH
56
50
0
21 Aug 2020
Small Towers Make Big Differences
Yuyan Wang
Zhe Zhao
Bo Dai
Christopher Fifty
Dong Lin
Lichan Hong
Ed H. Chi
63
10
0
13 Aug 2020
Compression of Deep Learning Models for Text: A Survey
Manish Gupta
Puneet Agrawal
VLM, MedIm, AI4CE
77
119
0
12 Aug 2020
CORD19STS: COVID-19 Semantic Textual Similarity Dataset
Xiao Guo
H. Mirzaalian
Ekraam Sabir
Ayush Jaiswal
Wael AbdAlmageed
38
33
0
05 Jul 2020
Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution
Hadi Pouransari
Mojan Javaheripi
Vinay Sharma
Oncel Tuzel
36
5
0
30 Jun 2020
CUHK at SemEval-2020 Task 4: CommonSense Explanation, Reasoning and Prediction with Multi-task Learning
Hongru Wang
Xiangru Tang
Sunny Lai
Kwong Sak Leung
Jia Zhu
Gabriel Pui Cheong Fung
K. Leung
ReLM, LRM
60
4
0
12 Jun 2020
Knowledge Distillation: A Survey
Jianping Gou
B. Yu
Stephen J. Maybank
Dacheng Tao
VLM
280
3,016
0
09 Jun 2020