ResearchTrend.AI

© 2025 ResearchTrend.AI, All rights reserved.

Knowledge distillation: A good teacher is patient and consistent
arXiv:2106.05237 · 9 June 2021
Lucas Beyer
Xiaohua Zhai
Amelie Royer
L. Markeeva
Rohan Anil
Alexander Kolesnikov
VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

50 / 203 papers shown

Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
11 May 2025

A Computational Model of Inclusive Pedagogy: From Understanding to Application
Francesco Balzan
Pedro P. Santos
Maurizio Gabbrielli
Mahault Albarracin
Manuel Lopes
02 May 2025

Scaling Laws for Data-Efficient Visual Transfer Learning
Wenxuan Yang
Qingqu Wei
Chenxi Ma
Weimin Tan
Bo Yan
17 Apr 2025

DUDA: Distilled Unsupervised Domain Adaptation for Lightweight Semantic Segmentation
Beomseok Kang
Niluthpol Chowdhury Mithun
Abhinav Rajvanshi
Han-Pang Chiu
S. Samarasekera
14 Apr 2025

Analysis of an Idealized Stochastic Polyak Method and its Application to Black-Box Model Distillation
Robert M. Gower
Guillaume Garrigos
Nicolas Loizou
Dimitris Oikonomou
Konstantin Mishchenko
Fabian Schaipp
02 Apr 2025

TwT: Thinking without Tokens by Habitual Reasoning Distillation with Multi-Teachers' Guidance
Jingxian Xu
Mengyu Zhou
W. Liu
Hanbing Liu
Shi Han
Dongmei Zhang
LRM
31 Mar 2025

Delving Deep into Semantic Relation Distillation
Zhaoyi Yan
Kangjun Liu
Qixiang Ye
27 Mar 2025

Distilling Stereo Networks for Performant and Efficient Leaner Networks
Rafia Rahim
Samuel Woerz
A. Zell
24 Mar 2025

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
27 Feb 2025

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
Rui Liu
Yu-cui Shen
Peng Gao
Pratap Tokekar
Ming C. Lin
25 Feb 2025

Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
24 Feb 2025

Understanding the Capabilities and Limitations of Weak-to-Strong Generalization
Wei Yao
Wenkai Yang
Z. Wang
Yankai Lin
Yong Liu
ELM
03 Feb 2025

Towards Mitigating Architecture Overfitting on Distilled Datasets
Xuyang Zhong
Chen Liu
DD
08 Jan 2025

Cross-View Consistency Regularisation for Knowledge Distillation
W. Zhang
Dongnan Liu
Weidong Cai
Chao Ma
21 Dec 2024

TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu
Wei Wang
T. Malepathirana
Sachith Seneviratne
D. Oetomo
Saman K. Halgamuge
10 Dec 2024

How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
09 Dec 2024

CLIP-PING: Boosting Lightweight Vision-Language Models with Proximus Intrinsic Neighbors Guidance
Chu Myaet Thwal
Ye Lin Tun
Minh N. H. Nguyen
Eui-nam Huh
Choong Seon Hong
VLM
05 Dec 2024

Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-xiong Wang
18 Oct 2024

Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving
Sihao Wu
Jiaxu Liu
Xiangyu Yin
Guangliang Cheng
Xingyu Zhao
Meng Fang
Xinping Yi
Xiaowei Huang
16 Oct 2024

Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
14 Oct 2024

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation
Mike Ranzinger
Jon Barker
Greg Heinrich
Pavlo Molchanov
Bryan Catanzaro
Andrew Tao
02 Oct 2024

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
Xiaoyu Yang
Qiujia Li
Chao Zhang
P. Woodland
25 Sep 2024

Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics
Burooj Ghani
Vincent J. Kalkman
Bob Planqué
Willem-Pier Vellinga
L. Gill
Dan Stowell
VLM
21 Sep 2024

Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
19 Sep 2024

Your Weak LLM is Secretly a Strong Teacher for Alignment
Leitian Tao
Yixuan Li
13 Sep 2024

Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation
Kangkai Zhang
Shiming Ge
Ruixin Shi
Dan Zeng
04 Sep 2024

Wav2Small: Distilling Wav2Vec2 to 72K parameters for Low-Resource Speech emotion recognition
Dionyssos Kounadis-Bastian
Oliver Schrufer
Anna Derington
H. Wierstorf
F. Eyben
Felix Burkhardt
Björn Schuller
25 Aug 2024

The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results
Boyang Li
Xinyi Ying
Ruojing Li
Yongxian Liu
Yangsi Shi
Miao Li
18 Aug 2024

How to Train the Teacher Model for Effective Knowledge Distillation
Shayan Mohajer Hamidi
Xizhen Deng
Renhao Tan
Linfeng Ye
Ahmed H. Salamah
25 Jul 2024

AMD: Automatic Multi-step Distillation of Large-scale Vision Models
Cheng Han
Qifan Wang
S. Dianat
Majid Rabbani
Raghuveer M. Rao
Yi Fang
Qiang Guan
Lifu Huang
Dongfang Liu
VLM
05 Jul 2024

Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation
Marco Mistretta
Alberto Baldrati
Marco Bertini
Andrew D. Bagdanov
VPVLM
VLM
03 Jul 2024

Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
29 Jun 2024

SCOPE: Stochastic Cartographic Occupancy Prediction Engine for Uncertainty-Aware Dynamic Navigation
Zhanteng Xie
P. Dames
28 Jun 2024

Leveraging Knowledge Distillation for Lightweight Skin Cancer Classification: Balancing Accuracy and Computational Efficiency
Niful Islam
Khan Md. Hasib
Fahmida Akter Joti
Asif Karim
Sami Azam
24 Jun 2024

On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan
Zhenyi Lu
Wei Wei
Jie Tian
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
17 Jun 2024

FastAST: Accelerating Audio Spectrogram Transformer via Token Merging and Cross-Model Knowledge Distillation
Swarup Ranjan Behera
Abhishek Dhiman
Karthik Gowda
Aalekhya Satya Narayani
11 Jun 2024

A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Leonardo F. S. Scabini
Andre Sacilotti
Kallil M. C. Zielinski
L. C. Ribas
B. De Baets
Odemir M. Bruno
ViT
10 Jun 2024

Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders
Tingxu Han
Weisong Sun
Ziqi Ding
Chunrong Fang
Hanwei Qian
Jiaxun Li
Zhenyu Chen
Xiangyu Zhang
AAML
05 Jun 2024

LADI v2: Multi-label Dataset and Classifiers for Low-Altitude Disaster Imagery
Samuel Scheele
Katherine Picchione
Jeffrey Liu
04 Jun 2024

Estimating Depth of Monocular Panoramic Image with Teacher-Student Model Fusing Equirectangular and Spherical Representations
Jingguo Liu
Yijun Xu
Shigang Li
Jianfeng Li
MDE
27 May 2024

Feature Expansion and enhanced Compression for Class Incremental Learning
Quentin Ferdinand
G. Chenadec
Benoit Clement
Panagiotis Papadakis
Quentin Oliveau
CLL
13 May 2024

iSEARLE: Improving Textual Inversion for Zero-Shot Composed Image Retrieval
Lorenzo Agnolucci
Alberto Baldrati
Marco Bertini
A. Bimbo
05 May 2024

Wake Vision: A Large-scale, Diverse Dataset and Benchmark Suite for TinyML Person Detection
Colby R. Banbury
Emil Njor
Matthew P. Stewart
Pete Warden
M. Kudlur
Nat Jeffries
Xenofon Fafoutis
Vijay Janapa Reddi
VLM
01 May 2024

CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective
Wencheng Zhu
Xin Zhou
Pengfei Zhu
Yu Wang
Qinghua Hu
VLM
22 Apr 2024

EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder
Hasanul Mahmud
Kevin Desai
P. Lama
Sushil Prasad
21 Apr 2024

MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities
Kunxi Li
Tianyu Zhan
Kairui Fu
Shengyu Zhang
Kun Kuang
Jiwei Li
Zhou Zhao
Fei Wu
MoMe
20 Apr 2024

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
18 Apr 2024

MobileNetV4 - Universal Models for the Mobile Ecosystem
Danfeng Qin
Chas Leichner
M. Delakis
Marco Fornoni
Shixin Luo
...
Berkin Akin
Vaibhav Aggarwal
Tenghui Zhu
Daniele Moro
Andrew G. Howard
MQ
16 Apr 2024

Camera clustering for scalable stream-based active distillation
Dani Manjah
Davide Cacciarelli
Christophe De Vleeschouwer
Benoit Macq
16 Apr 2024

ReffAKD: Resource-efficient Autoencoder-based Knowledge Distillation
Divyang Doshi
Jung-Eun Kim
15 Apr 2024