Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation

8 May 2023
Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Jialu Liu, Michael Bendersky, Marc Najork, Chao Zhang

Papers citing "Do Not Blindly Imitate the Teacher: Using Perturbed Loss for Knowledge Distillation" (16 papers shown)
Simple Semi-supervised Knowledge Distillation from Vision-Language Models via Dual-Head Optimization
Seongjae Kang, Dong Bok Lee, Hyungjoon Jang, Sung Ju Hwang
VLM
57 · 0 · 0 · 12 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang, Rongtao Xu, Jie Zhou, Changwei Wang, Xingtian Pei, ..., Jiguang Zhang, Li Guo, Longxiang Gao, W. Xu, Shibiao Xu
ViT
139 · 0 · 0 · 06 May 2025
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
Anshumann, Mohd Abbas Zaidi, Akhil Kedia, Jinwoo Ahn, Taehwak Kwon, Kangwook Lee, Haejun Lee, Joohyung Lee
FedML
161 · 0 · 0 · 21 Mar 2025
FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices
Zhidong Gao, Yu Zhang, Zhenxiao Zhang, Yanmin Gong, Yuanxiong Guo
18 · 0 · 0 · 01 Oct 2024
ProFuser: Progressive Fusion of Large Language Models
Tianyuan Shi, Fanqi Wan, Canbin Huang, Xiaojun Quan, Chenliang Li, Ming Yan, Ji Zhang
MoMe
28 · 2 · 0 · 09 Aug 2024
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei
37 · 6 · 0 · 28 Jun 2024
PLaD: Preference-based Large Language Model Distillation with Pseudo-Preference Pairs
Rongzhi Zhang, Jiaming Shen, Tianqi Liu, Haorui Wang, Zhen Qin, Feng Han, Jialu Liu, Simon Baumgartner, Michael Bendersky, Chao Zhang
37 · 6 · 0 · 05 Jun 2024
Estimating Human Poses Across Datasets: A Unified Skeleton and Multi-Teacher Distillation Approach
Muhammad Gul Zain Ali Khan, Dhavalkumar Limbachiya, Didier Stricker, Muhammad Zeshan Afzal
3DH
32 · 0 · 0 · 30 May 2024
Aligning in a Compact Space: Contrastive Knowledge Distillation between Heterogeneous Architectures
Hongjun Wu, Li Xiao, Xingkuo Zhang, Yining Miao
38 · 1 · 0 · 28 May 2024
Logit Standardization in Knowledge Distillation
Shangquan Sun, Wenqi Ren, Jingzhi Li, Rui Wang, Xiaochun Cao
37 · 56 · 0 · 03 Mar 2024
MiniLLM: Knowledge Distillation of Large Language Models
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
ALM
31 · 77 · 0 · 14 Jun 2023
Local Boosting for Weakly-Supervised Learning
Rongzhi Zhang, Yue Yu, Jiaming Shen, Xiquan Cui, Chao Zhang
WSOL
39 · 2 · 0 · 05 Jun 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, Chao Zhang
VLM
30 · 33 · 0 · 18 May 2023
Distilling Knowledge via Knowledge Review
Pengguang Chen, Shu Liu, Hengshuang Zhao, Jiaya Jia
149 · 420 · 0 · 19 Apr 2021
Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
51 · 42 · 0 · 20 Oct 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
297 · 6,950 · 0 · 20 Apr 2018