ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Knowledge distillation: A good teacher is patient and consistent
arXiv:2106.05237 · 9 June 2021
Lucas Beyer
Xiaohua Zhai
Amelie Royer
L. Markeeva
Rohan Anil
Alexander Kolesnikov
VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

50 / 203 papers shown

Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
11 May 2025

A Computational Model of Inclusive Pedagogy: From Understanding to Application
Francesco Balzan
Pedro P. Santos
Maurizio Gabbrielli
Mahault Albarracin
Manuel Lopes
02 May 2025

Scaling Laws for Data-Efficient Visual Transfer Learning
Wenxuan Yang
Qingqu Wei
Chenxi Ma
Weimin Tan
Bo Yan
17 Apr 2025

DUDA: Distilled Unsupervised Domain Adaptation for Lightweight Semantic Segmentation
Beomseok Kang
Niluthpol Chowdhury Mithun
Abhinav Rajvanshi
Han-Pang Chiu
S. Samarasekera
14 Apr 2025

Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation
Robert M. Gower
Guillaume Garrigos
Nicolas Loizou
Dimitris Oikonomou
Konstantin Mishchenko
Fabian Schaipp
02 Apr 2025

TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance
Jingxian Xu
Mengyu Zhou
W. Liu
Hanbing Liu
Shi Han
Dongmei Zhang
LRM
31 Mar 2025

Delving Deep into Semantic Relation Distillation
Zhaoyi Yan
Kangjun Liu
Qixiang Ye
27 Mar 2025

Distilling Stereo Networks for Performant and Efficient Leaner Networks
Rafia Rahim
Samuel Woerz
A. Zell
24 Mar 2025

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
27 Feb 2025

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
Rui Liu
Yu-cui Shen
Peng Gao
Pratap Tokekar
Ming C. Lin
25 Feb 2025

Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
24 Feb 2025

Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
03 Feb 2025

Towards Mitigating Architecture Overfitting on Distilled Datasets
Xuyang Zhong
Chen Liu
DD
08 Jan 2025

Cross-View Consistency Regularisation for Knowledge Distillation
W. Zhang
Dongnan Liu
Weidong Cai
Chao Ma
21 Dec 2024

TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu
Wei Wang
T. Malepathirana
Sachith Seneviratne
D. Oetomo
Saman K. Halgamuge
10 Dec 2024

How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
09 Dec 2024

CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance
Chu Myaet Thwal
Ye Lin Tun
Minh N. H. Nguyen
Eui-nam Huh
Choong Seon Hong
VLM
05 Dec 2024

Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-xiong Wang
18 Oct 2024

Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving
Sihao Wu
Jiaxu Liu
Xiangyu Yin
Guangliang Cheng
Xingyu Zhao
Meng Fang
Xinping Yi
Xiaowei Huang
16 Oct 2024

Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
14 Oct 2024

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger
Jon Barker
Greg Heinrich
Pavlo Molchanov
Bryan Catanzaro
Andrew Tao
02 Oct 2024

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
25 Sep 2024

Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics
Burooj Ghani
Vincent J. Kalkman
Bob Planqué
Willem-Pier Vellinga
L. Gill
Dan Stowell
VLM
21 Sep 2024

Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
19 Sep 2024

Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
13 Sep 2024

Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation
Kangkai Zhang
Shiming Ge
Ruixin Shi
Dan Zeng
04 Sep 2024

Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
Dionyssos Kounadis-Bastian
Oliver Schrufer
Anna Derington
H. Wierstorf
F. Eyben
Felix Burkhardt
Björn Schuller
25 Aug 2024

The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results
Boyang Li
Xinyi Ying
Ruojing Li
Yongxian Liu
Yangsi Shi
Miao Li
18 Aug 2024

How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi
Xizhen Deng
Renhao Tan
Linfeng Ye
Ahmed H. Salamah
25 Jul 2024

AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
05 Jul 2024

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta
Alberto Baldrati
Marco Bertini
Andrew D. Bagdanov
VPVLM
VLM
03 Jul 2024

Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
29 Jun 2024

SCOPE: Stochastic Cartographic Occupancy Prediction Engine for Uncertainty-Aware Dynamic Navigation
Zhanteng Xie
P. Dames
28 Jun 2024

Leveraging Knowledge Distillation for Lightweight Skin Cancer Classification: Balancing Accuracy and Computational Efficiency
Niful Islam
Khan Md. Hasib
Fahmida Akter Joti
Asif Karim
Sami Azam
24 Jun 2024

On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan
Zhenyi Lu
Wei Wei
Jie Tian
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
17 Jun 2024

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
11 Jun 2024

A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Leonardo F. S. Scabini
Andre Sacilotti
Kallil M. C. Zielinski
L. C. Ribas
B. De Baets
Odemir M. Bruno
ViT
10 Jun 2024

Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
Tingxu Han
Weisong Sun
Ziqi Ding
Chunrong Fang
Hanwei Qian
Jiaxun Li
Zhenyu Chen
Xiangyu Zhang
AAML
05 Jun 2024

LADI v2: Multi-label Dataset and Classifiers for Low-Altitude Disaster Imagery
Samuel Scheele
Katherine Picchione
Jeffrey Liu
04 Jun 2024

Estimating Depth of Monocular Panoramic Image with Teacher-Student Model Fusing Equirectangular and Spherical Representations
Jingguo Liu
Yijun Xu
Shigang Li
Jianfeng Li
MDE
27 May 2024

Feature Expansion and enhanced Compression for Class Incremental Learning
Quentin Ferdinand
G. Chenadec
Benoit Clement
Panagiotis Papadakis
Quentin Oliveau
CLL
13 May 2024

iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval
Lorenzo Agnolucci
Alberto Baldrati
Marco Bertini
A. Bimbo
05 May 2024

Wake Vision: A Large-scale, Diverse Dataset and Benchmark Suite for TinyML Person Detection
Colby R. Banbury
Emil Njor
Matthew P. Stewart
Pete Warden
M. Kudlur
Nat Jeffries
Xenofon Fafoutis
Vijay Janapa Reddi
VLM
01 May 2024

CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective
Wencheng Zhu
Xin Zhou
Pengfei Zhu
Yu Wang
Qinghua Hu
VLM
22 Apr 2024

EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder
Hasanul Mahmud
Kevin Desai
P. Lama
Sushil Prasad
21 Apr 2024

MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities
Kunxi Li
Tianyu Zhan
Kairui Fu
Shengyu Zhang
Kun Kuang
Jiwei Li
Zhou Zhao
Fei Wu
MoMe
20 Apr 2024

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
18 Apr 2024

MobileNetV4 - Universal Models for the Mobile Ecosystem
Danfeng Qin
Chas Leichner
M. Delakis
Marco Fornoni
Shixin Luo
...
Berkin Akin
Vaibhav Aggarwal
Tenghui Zhu
Daniele Moro
Andrew G. Howard
MQ
16 Apr 2024

Camera clustering for scalable stream-based active distillation
Dani Manjah
Davide Cacciarelli
Christophe De Vleeschouwer
Benoit Macq
16 Apr 2024

ReffAKD: Resource-efficient Autoencoder-based Knowledge Distillation
Divyang Doshi
Jung-Eun Kim
15 Apr 2024