Improved Knowledge Distillation via Teacher Assistant
arXiv:1902.03393, 9 February 2019
Seyed Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, H. Ghasemzadeh
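For context, the paper's core idea is to bridge the capacity gap between a large teacher and a small student by distilling through an intermediate-size "teacher assistant": teacher into assistant, then assistant into student. Below is a minimal PyTorch sketch of that two-step pipeline; the loss is standard Hinton-style distillation, and the model names, data loader, and hyperparameters are illustrative placeholders, not the authors' code.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Hinton-style distillation loss: temperature-softened KL plus hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to be comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def distill(teacher, student, loader, epochs=1, lr=1e-3, device="cpu"):
    """Train `student` to mimic a frozen `teacher` on `loader`."""
    teacher.eval().to(device)
    student.train().to(device)
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            with torch.no_grad():
                t_logits = teacher(x)  # teacher provides soft targets only
            loss = kd_loss(student(x), t_logits, y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return student

# Teacher -> assistant -> student: each step bridges a smaller capacity gap.
# big_net, mid_net, small_net, and train_loader are hypothetical placeholders.
# assistant = distill(big_net, mid_net, train_loader)
# student   = distill(assistant, small_net, train_loader)
```

Running the same `distill` routine twice, with the assistant as student in the first pass and as teacher in the second, is the whole trick: the paper's finding is that this chain can outperform distilling directly from the largest teacher.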
Papers citing "Improved Knowledge Distillation via Teacher Assistant" (50 of 151 shown)
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization. Seongjae Kang, Dong Bok Lee, Hyungjoon Jang, Sung Ju Hwang. 12 May 2025. [VLM]
ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence. Guanghui Wang, Zhiyong Yang, Z. Wang, Shi Wang, Qianqian Xu, Q. Huang. 07 May 2025.
Swapped Logit Distillation via Bi-level Teacher Alignment. Stephen Ekaputra Limantoro, Jhe-Hao Lin, Chih-Yu Wang, Yi-Lung Tsai, Hong-Han Shuai, Ching-Chun Huang, Wen-Huang Cheng. 27 Apr 2025.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding. Jinlong Li, Cristiano Saltori, Fabio Poiesi, N. Sebe. 20 Mar 2025.
Segment Any-Quality Images with Generative Latent Space Enhancement. Guangqian Guo, Yoong Guo, Xuehui Yu, Wenbo Li, Yaoxing Wang, Shan Gao. 16 Mar 2025. [VLM]
ProReflow: Progressive Reflow with Decomposed Velocity. Lei Ke, Haohang Xu, Xuefei Ning, Y. Li, J. Li, Haoling Li, Yuxuan Lin, Dongsheng Jiang, Y. Yang, Linfeng Zhang. 05 Mar 2025. [DiffM]
VRM: Knowledge Distillation via Virtual Relation Matching. W. Zhang, Fei Xie, Weidong Cai, Chao Ma. 28 Feb 2025.
TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models. Makoto Shing, Kou Misaki, Han Bao, Sho Yokoi, Takuya Akiba. 28 Jan 2025. [VLM]
Knowledge Distillation with Adapted Weight. Sirong Wu, Xi Luo, Junjie Liu, Yuhui Deng. 06 Jan 2025.
GazeGen: Gaze-Driven User Interaction for Visual Content Generation. He-Yen Hsieh, Ziyun Li, Sai Qian Zhang, W. Ting, Kao-Den Chang, B. D. Salvo, Chiao Liu, H. T. Kung. 07 Nov 2024. [VGen]
Improving DNN Modularization via Activation-Driven Training. Tuan Ngo, Abid Hassan, Saad Shafiq, Nenad Medvidovic. 01 Nov 2024. [MoMe]
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models. Jahyun Koo, Yerin Hwang, Yongil Kim, Taegwan Kang, Hyunkyung Bae, Kyomin Jung. 25 Oct 2024.
MiniPLM: Knowledge Distillation for Pre-Training Language Models. Yuxian Gu, Hao Zhou, Fandong Meng, Jie Zhou, Minlie Huang. 22 Oct 2024.
CREAM: Consistency Regularized Self-Rewarding Language Models. Z. Wang, Weilei He, Zhiyuan Liang, Xuchao Zhang, Chetan Bansal, Ying Wei, Weitong Zhang, Huaxiu Yao. 16 Oct 2024. [ALM]
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation. Mike Ranzinger, Jon Barker, Greg Heinrich, Pavlo Molchanov, Bryan Catanzaro, Andrew Tao. 02 Oct 2024.
Classroom-Inspired Multi-Mentor Distillation with Adaptive Learning Strategies. Shalini Sarode, Muhammad Saif Ullah Khan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal. 30 Sep 2024.
Towards Model-Agnostic Dataset Condensation by Heterogeneous Models. Jun-Yeong Moon, Jung Uk Kim, Gyeong-Moon Park. 22 Sep 2024. [DD]
MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU. Yan Li, So-Eon Kim, Seong-Bae Park, S. Han. 15 Aug 2024.
Relational Representation Distillation. Nikolaos Giakoumoglou, Tania Stathaki. 16 Jul 2024.
Direct Preference Knowledge Distillation for Large Language Models. Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei. 28 Jun 2024.
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications. Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas. 12 Jun 2024.
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning. Runqian Wang, Soumya Ghosh, David D. Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky. 27 May 2024.
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation. Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li. 27 May 2024. [SyDa]
CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective. Wencheng Zhu, Xin Zhou, Pengfei Zhu, Yu Wang, Qinghua Hu. 22 Apr 2024. [VLM]
MonoTAKD: Teaching Assistant Knowledge Distillation for Monocular 3D Object Detection. Hou-I Liu, Christine Wu, Jen-Hao Cheng, Wenhao Chai, Shian-Yun Wang, ..., Jenq-Neng Hwang, Hong-Han Shuai, Wen-Huang Cheng. 07 Apr 2024.
Adversarial Sparse Teacher: Defense Against Distillation-Based Model Stealing Attacks Using Adversarial Examples. Eda Yilmaz, H. Keles. 08 Mar 2024. [AAML]
GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation. Ayan Banerjee, Sanket Biswas, Josep Lladós, Umapada Pal. 17 Feb 2024.
TinyGSM: achieving >80% on GSM8k with small language models. Bingbin Liu, Sébastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel A. Ward, Yi Zhang. 14 Dec 2023. [ALM]
Cooperative Learning for Cost-Adaptive Inference. Xingli Fang, Richard M. Bradford, Jung-Eun Kim. 13 Dec 2023.
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One. Michael Ranzinger, Greg Heinrich, Jan Kautz, Pavlo Molchanov. 10 Dec 2023. [VLM]
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP. Yoshitomo Matsubara. 26 Oct 2023. [VLM]
Understanding the Effects of Projectors in Knowledge Distillation. Yudong Chen, Sen Wang, Jiajun Liu, Xuwei Xu, Frank de Hoog, Brano Kusy, Zi Huang. 26 Oct 2023.
Knowledge Distillation for Anomaly Detection. Adrian Alan Pol, E. Govorkova, Sonja Grönroos, N. Chernyavskaya, Philip C. Harris, M. Pierini, I. Ojalvo, P. Elmer. 09 Oct 2023.
Multi-Label Knowledge Distillation. Penghui Yang, Ming-Kun Xie, Chen-Chen Zong, Lei Feng, Gang Niu, Masashi Sugiyama, Sheng-Jun Huang. 12 Aug 2023.
Teacher-Student Architecture for Knowledge Distillation: A Survey. Chengming Hu, Xuan Li, Danyang Liu, Haolun Wu, Xi Chen, Ju Wang, Xue Liu. 08 Aug 2023.
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models. Seungcheol Park, Ho-Jin Choi, U. Kang. 07 Aug 2023. [VLM]
Review helps learn better: Temporal Supervised Knowledge Distillation. Dongwei Wang, Zhi-Long Han, Yanmei Wang, Xi’ai Chen, Baicheng Liu, Yandong Tang. 03 Jul 2023.
CrossKD: Cross-Head Knowledge Distillation for Object Detection. Jiabao Wang, Yuming Chen, Zhaohui Zheng, Xiang Li, Ming-Ming Cheng, Qibin Hou. 20 Jun 2023.
GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model. Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, ..., Jiahao Liu, Jingang Wang, Shuo Zhao, Peng-Zhen Zhang, Jie Tang. 11 Jun 2023. [ALM, MoE]
Knowledge Diffusion for Distillation. Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chao Qian, Chang Xu. 25 May 2023.
Decoupled Kullback-Leibler Divergence Loss. Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang. 23 May 2023.
Learning from Mistakes via Cooperative Study Assistant for Large Language Models. Danqing Wang, Lei Li. 23 May 2023.
Lifting the Curse of Capacity Gap in Distilling Language Models. Chen Zhang, Yang Yang, Jiahao Liu, Jingang Wang, Yunsen Xian, Benyou Wang, Dawei Song. 20 May 2023. [MoE]
Student-friendly Knowledge Distillation. Mengyang Yuan, Bo Lang, Fengnan Quan. 18 May 2023.
Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling. Shitong Shao, Xu Dai, Shouyi Yin, Lujun Li, Huanran Chen, Yang Hu. 18 May 2023.
Boost Vision Transformer with GPU-Friendly Sparsity and Quantization. Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan. 18 May 2023. [MQ, ViT]
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation. Yuxin Ren, Zi-Qi Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li. 16 May 2023.
Analyzing Compression Techniques for Computer Vision. Maniratnam Mandal, Imran Khan. 14 May 2023.
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training. Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor. 03 May 2023.
SoK: Pragmatic Assessment of Machine Learning for Network Intrusion Detection. Giovanni Apruzzese, P. Laskov, J. Schneider. 30 Apr 2023.