Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System

18 October 2019
Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
arXiv: 1910.08381

Papers citing "Model Compression with Two-stage Multi-teacher Knowledge Distillation for Web Question Answering System"

47 papers
Larger models yield better results? Streamlined severity classification of ADHD-related concerns using BERT-based knowledge distillation
Ahmed Akib Jawad Karim, Kazi Hafiz Md. Asad, Md. Golam Rabiul Alam
AI4MH · 44 · 2 · 0 · 30 Oct 2024
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger, Jon Barker, Greg Heinrich, Pavlo Molchanov, Bryan Catanzaro, Andrew Tao
42 · 5 · 0 · 02 Oct 2024
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick, Kevin Y. Li, Eric P. Xing, J. Zico Kolter, Albert Gu
Mamba · 56 · 24 · 0 · 19 Aug 2024
Self-Regulated Data-Free Knowledge Amalgamation for Text Classification
Prashanth Vijayaraghavan, Hongzhi Wang, Luyao Shi, Tyler Baldwin, David Beymer, Ehsan Degan
37 · 1 · 0 · 16 Jun 2024
MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models
Justin Chih-Yao Chen, Swarnadeep Saha, Elias Stengel-Eskin, Mohit Bansal
LRM, LLMAG · 32 · 15 · 0 · 02 Feb 2024
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One
Michael Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov
VLM · 44 · 42 · 0 · 10 Dec 2023
MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval
Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Sijia Li, Defeng Xie, H. Lu
VLM · 59 · 0 · 0 · 30 Oct 2023
Ensemble Distillation for Unsupervised Constituency Parsing
Behzad Shayegh, Yanshuai Cao, Xiaodan Zhu, Jackie C.K. Cheung, Lili Mou
50 · 5 · 0 · 03 Oct 2023
Inherit with Distillation and Evolve with Contrast: Exploring Class Incremental Semantic Segmentation Without Exemplar Memory
Danpei Zhao, Bo Yuan, Z. Shi
VLM, CLL · 31 · 9 · 0 · 27 Sep 2023
Adaptive Prompt Learning with Distilled Connective Knowledge for Implicit Discourse Relation Recognition
Bang Wang, Zhenglin Wang, Wei Xiang, Yijun Mo
CLL · 24 · 2 · 0 · 14 Sep 2023
Teacher-Student Architecture for Knowledge Distillation: A Survey
Chengming Hu, Xuan Li, Danyang Liu, Haolun Wu, Xi Chen, Ju Wang, Xue Liu
21 · 16 · 0 · 08 Aug 2023
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
30 · 53 · 0 · 27 Jul 2023
The Staged Knowledge Distillation in Video Classification: Harmonizing Student Progress by a Complementary Weakly Supervised Framework
Chao Wang, Zhenghang Tang
32 · 1 · 0 · 11 Jul 2023
GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, ..., Jiahao Liu, Jingang Wang, Shuo Zhao, Peng-Zhen Zhang, Jie Tang
ALM, MoE · 33 · 11 · 0 · 11 Jun 2023
Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shuo Zhao, Peng-Zhen Zhang, Jie Tang
VLM · 27 · 1 · 0 · 11 Jun 2023
AMTSS: An Adaptive Multi-Teacher Single-Student Knowledge Distillation Framework For Multilingual Language Inference
Qianglong Chen, Feng Ji, Feng-Lin Li, Guohai Xu, Ming Yan, Ji Zhang, Yin Zhang
23 · 0 · 0 · 13 May 2023
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models
Steven M. Hernandez, Ding Zhao, Shaojin Ding, A. Bruguier, Rohit Prabhavalkar, Tara N. Sainath, Yanzhang He, Ian McGraw
26 · 7 · 0 · 15 Mar 2023
LightTS: Lightweight Time Series Classification with Adaptive Ensemble Distillation -- Extended Version
David Campos, Miao Zhang, B. Yang, Tung Kieu, Chenjuan Guo, Christian S. Jensen
AI4TS · 45 · 47 · 0 · 24 Feb 2023
ProKD: An Unsupervised Prototypical Knowledge Distillation Network for Zero-Resource Cross-Lingual Named Entity Recognition
Ling Ge, Chuming Hu, Guanghui Ma, Hong Zhang, Jihong Liu
16 · 3 · 0 · 21 Jan 2023
EPIK: Eliminating multi-model Pipelines with Knowledge-distillation
Bhavesh Laddagiri, Yash Raj, Anshuman Dash
16 · 0 · 0 · 27 Nov 2022
RoS-KD: A Robust Stochastic Knowledge Distillation Approach for Noisy Medical Imaging
A. Jaiswal, Kumar Ashutosh, Justin F. Rousseau, Yifan Peng, Zhangyang Wang, Ying Ding
20 · 9 · 0 · 15 Oct 2022
Boosting Graph Neural Networks via Adaptive Knowledge Distillation
Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh V. Chawla
21 · 32 · 0 · 12 Oct 2022
Linkless Link Prediction via Relational Distillation
Zhichun Guo, William Shiao, Shichang Zhang, Yozen Liu, Nitesh V. Chawla, Neil Shah, Tong Zhao
21 · 41 · 0 · 11 Oct 2022
Learning by Distilling Context
Charles Burton Snell, Dan Klein, Ruiqi Zhong
ReLM, LRM · 168 · 44 · 0 · 30 Sep 2022
Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation
Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao
18 · 26 · 0 · 03 Aug 2022
NICEST: Noisy Label Correction and Training for Robust Scene Graph Generation
Lin Li, Jun Xiao, Hanrong Shi, Hanwang Zhang, Yi Yang, Wei Liu, Long Chen
26 · 22 · 0 · 27 Jul 2022
Toward Student-Oriented Teacher Network Training For Knowledge Distillation
Chengyu Dong, Liyuan Liu, Jingbo Shang
43 · 6 · 0 · 14 Jun 2022
Towards Data-Free Model Stealing in a Hard Label Setting
Sunandini Sanyal, Sravanti Addepalli, R. Venkatesh Babu
AAML · 35 · 85 · 0 · 23 Apr 2022
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
26 · 6 · 0 · 15 Jan 2022
Improving Neural Cross-Lingual Summarization via Employing Optimal Transport Distance for Knowledge Distillation
Thong Nguyen, A. Luu
60 · 40 · 0 · 07 Dec 2021
HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression
Chenhe Dong, Yaliang Li, Ying Shen, Minghui Qiu
VLM · 34 · 7 · 0 · 16 Oct 2021
Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban, Seyed Morteza Mirbostani, Gholamreza Ghassem-Sani, Seyed Abolghasem Mirroshandel, Shahin Amiriparian
32 · 15 · 0 · 26 Sep 2021
Multihop: Leveraging Complex Models to Learn Accurate Simple Models
Amit Dhurandhar, Tejaswini Pedapati
19 · 0 · 0 · 14 Sep 2021
RefBERT: Compressing BERT by Referencing to Pre-computed Representations
Xinyi Wang, Haiqing Yang, Liang Zhao, Yang Mo, Jianping Shen
MQ · 20 · 3 · 0 · 11 Jun 2021
Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
Shining Liang, Ming Gong, J. Pei, Linjun Shou, Wanli Zuo, Xianglin Zuo, Daxin Jiang
31 · 34 · 0 · 01 Jun 2021
Self-Teaching Machines to Read and Comprehend with Large-Scale Multi-Subject Question-Answering Data
Dian Yu, Kai Sun, Dong Yu, Claire Cardie
34 · 7 · 0 · 01 Feb 2021
Improving Multi-hop Knowledge Base Question Answering by Learning Intermediate Supervision Signals
Gaole He, Yunshi Lan, Jing Jiang, Wayne Xin Zhao, Ji-Rong Wen
120 · 187 · 0 · 11 Jan 2021
Reinforced Multi-Teacher Selection for Knowledge Distillation
Fei Yuan, Linjun Shou, J. Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang
15 · 121 · 0 · 11 Dec 2020
Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains
Haojie Pan, Chengyu Wang, Minghui Qiu, Yichang Zhang, Yaliang Li, Jun Huang
23 · 49 · 0 · 02 Dec 2020
Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation
Junhao Liu, Linjun Shou, J. Pei, Ming Gong, Min Yang, Daxin Jiang
26 · 13 · 0 · 27 Oct 2020
Improved Synthetic Training for Reading Comprehension
Yanda Chen, Md Arafat Sultan, T. J. W. R. Center
SyDa · 29 · 5 · 0 · 24 Oct 2020
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data
Qianhui Wu, Zijia Lin, Börje F. Karlsson, Biqing Huang, Jian-Guang Lou
21 · 46 · 0 · 15 Jul 2020
Mining Implicit Relevance Feedback from User Behavior for Web Question Answering
Linjun Shou, Shining Bo, Feixiang Cheng, Ming Gong, J. Pei, Daxin Jiang
14 · 9 · 0 · 13 Jun 2020
Knowledge Distillation: A Survey
Jianping Gou, B. Yu, Stephen J. Maybank, Dacheng Tao
VLM · 19 · 2,843 · 0 · 09 Jun 2020
Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language
Qianhui Wu, Zijia Lin, Börje F. Karlsson, Jian-Guang Lou, Biqing Huang
16 · 69 · 0 · 26 Apr 2020
Teacher-Class Network: A Neural Network Compression Mechanism
Shaiq Munir Malik, Muhammad Umair Haider, Fnu Mohbat, Musab Rasheed, M. Taj
17 · 5 · 0 · 07 Apr 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM · 297 · 6,984 · 0 · 20 Apr 2018