Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting

6 May 2025
Seongmin Kim, Kwanho Kim, Minseung Kim, Kanghyun Jo
Abstract

Although deep learning models owe their remarkable success to deep and complex architectures, this very complexity typically comes at the expense of real-time performance. To address this issue, a variety of model compression techniques have been proposed, among which knowledge distillation (KD) stands out for its strong empirical performance. KD involves two concurrent processes: (i) matching the outputs of a large, pre-trained teacher network and a lightweight student network, and (ii) training the student to solve its designated downstream task. The associated loss functions are termed the distillation loss and the downstream-task loss, respectively. Numerous prior studies report that KD is most effective when the influence of the distillation loss outweighs that of the downstream-task loss. This influence (or importance) is typically regulated by a balancing parameter. This paper provides a mathematical rationale showing that, in a simple KD setting, when the loss is decreasing, the balancing parameter should be dynamically adjusted.
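
For reference, the objective the abstract describes is conventionally written as L = alpha * L_distill + (1 - alpha) * L_task, where alpha is the balancing parameter the paper analyzes. Below is a minimal PyTorch-style sketch of that standard (Hinton-style) KD objective, not the authors' exact formulation; the fixed alpha, the temperature T, and the KL-based distillation term are illustrative assumptions.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, alpha=0.9, T=4.0):
    # Distillation loss: match the softened teacher and student output distributions.
    distill = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Downstream-task loss: ordinary cross-entropy against the ground-truth labels.
    task = F.cross_entropy(student_logits, targets)
    # alpha weights the distillation term against the task term; the paper's claim
    # is that holding alpha fixed is suboptimal once the loss starts decreasing.
    return alpha * distill + (1.0 - alpha) * task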

@article{kim2025_2505.06270,
  title={Importance Analysis for Dynamic Control of Balancing Parameter in a Simple Knowledge Distillation Setting},
  author={Seongmin Kim and Kwanho Kim and Minseung Kim and Kanghyun Jo},
  journal={arXiv preprint arXiv:2505.06270},
  year={2025}
}