Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation
arXiv:2305.05010, 8 May 2023
Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Jialu Liu, Michael Bendersky, Marc Najork, Chao Zhang
Papers citing "Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation" (16 papers shown)
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization
Seongjae Kang, Dong Bok Lee, Hyungjoon Jang, Sung Ju Hwang
VLM
12 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang, Rongtao Xu, Jie Zhou, Changwei Wang, Xingtian Pei, ..., Jiguang Zhang, Li Guo, Longxiang Gao, W. Xu, Shibiao Xu
ViT
06 May 2025
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Anshumann, Mohd Abbas Zaidi, Akhil Kedia, Jinwoo Ahn, Taehwak Kwon, Kangwook Lee, Haejun Lee, Joohyung Lee
FedML
21 Mar 2025
FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
Zhidong Gao, Yu Zhang, Zhenxiao Zhang, Yanmin Gong, Yuanxiong Guo
01 Oct 2024
ProFuser: Progressive Fusion of Large Language Models
Tianyuan Shi, Fanqi Wan, Canbin Huang, Xiaojun Quan, Chenliang Li, Ming Yan, Ji Zhang
MoMe
09 Aug 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei
28 Jun 2024
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Haorui Wang, Zhen Qin, Feng Han, Jialu Liu, Simon Baumgartner, Michael Bendersky, Chao Zhang
05 Jun 2024
Estimating Human Poses Across Datasets: A Unified Skeleton and Multi-Teacher Distillation Approach
Muhammad Gul Zain Ali Khan, Dhavalkumar Limbachiya, Didier Stricker, Muhammad Zeshan Afzal
3DH
30 May 2024
Aligning in a Compact Space: Contrastive Knowledge Distillation between Heterogeneous Architectures
Hongjun Wu, Li Xiao, Xingkuo Zhang, Yining Miao
28 May 2024
Logit Standardization in Knowledge Distillation
Shangquan Sun, Wenqi Ren, Jingzhi Li, Rui Wang, Xiaochun Cao
03 Mar 2024
MiniLLM: Knowledge Distillation of Large Language Models
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
ALM
14 Jun 2023
Local Boosting for Weakly-Supervised Learning
Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang
WSOL
05 Jun 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, Chao Zhang
VLM
18 May 2023
Distilling Knowledge via Knowledge Review
Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
19 Apr 2021
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
20 Apr 2018