Knowledge distillation: A good teacher is patient and consistent
arXiv 2106.05237 · 9 June 2021
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

50 / 203 papers shown

Progressive Learning without Forgetting
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang
CLL, KELM
28 Nov 2022

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang
MQ
23 Nov 2022

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
17 Nov 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
17 Nov 2022

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han, Errui Ding
15 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
14 Nov 2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid, Khaled Koutini, Gerhard Widmer
ViT
09 Nov 2022

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
VLM
01 Nov 2022

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu
18 Oct 2022

Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
16 Oct 2022

Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed
14 Oct 2022

Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars
09 Oct 2022

Robust Active Distillation
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee
03 Oct 2022

Global Semantic Descriptors for Zero-Shot Action Recognition
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti
24 Sep 2022

TeST: Test-time Self-Training under Distribution Shift
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele
TTA, OOD
23 Sep 2022

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
15 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
13 Sep 2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B. Hashimoto
08 Sep 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima
20 Aug 2022

SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis
GAN
08 Aug 2022

Efficient One Pass Self-distillation with Zipf's Label Smoothing
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
26 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi
OOD
05 Jul 2022

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee
FedML
31 May 2022

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima, R. Krohling
ViT, MedIm
30 May 2022

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian-jun Sun, Weiming Hu
ViT
28 May 2022

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong
08 May 2022

Merging of neural networks
Martin Pasen, Vladimír Boza
FedML, MoMe
21 Apr 2022

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy
07 Apr 2022

Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark
01 Apr 2022

On the benefits of knowledge distillation for adversarial robustness
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard
AAML, FedML
14 Mar 2022

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass
VLM
13 Mar 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt
MoMe
10 Mar 2022

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Y. Zou
VLM
03 Mar 2022

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
16 Feb 2022

It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir
18 Jan 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
13 Jan 2022

Microdosing: Knowledge Distillation for GAN based Compression
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers
07 Jan 2022

Ex-Model: Continual Learning from a Stream of Trained Models
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu
CLL
13 Dec 2021

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric P. Xing
VLM
02 Dec 2021

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed
01 Dec 2021

PP-ShiTu: A Practical Lightweight Image Recognition System
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
CVBM
01 Nov 2021

Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
17 Oct 2021

Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le
02 Aug 2021

Teacher's pet: understanding and mitigating biases in distillation
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar
19 Jun 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
FedML
10 Jun 2021

On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, F. Khan, Fatih Porikli
ViT
08 Jun 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
04 May 2021

Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun
13 Jan 2021

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, A. Srinivas, Rui Qian, Tsung-Yi Lin, E. D. Cubuk, Quoc V. Le, Barret Zoph
ISeg
13 Dec 2020

What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag
06 Mar 2020