Knowledge distillation: A good teacher is patient and consistent
arXiv 2106.05237 · 9 June 2021
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov
VLM

Papers citing "Knowledge distillation: A good teacher is patient and consistent"

50 / 203 papers shown

Progressive Learning without Forgetting
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang
CLL, KELM
28 Nov 2022

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang
MQ
23 Nov 2022

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein
17 Nov 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev
17 Nov 2022

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han, Errui Ding
15 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
14 Nov 2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid, Khaled Koutini, Gerhard Widmer
ViT
09 Nov 2022

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt
VLM
01 Nov 2022

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu
18 Oct 2022

Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
16 Oct 2022

Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed
14 Oct 2022

Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars
09 Oct 2022

Robust Active Distillation
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee
03 Oct 2022

Global Semantic Descriptors for Zero-Shot Action Recognition
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti
24 Sep 2022

TeST: Test-time Self-Training under Distribution Shift
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele
TTA, OOD
23 Sep 2022

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth
15 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
13 Sep 2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B. Hashimoto
08 Sep 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima
20 Aug 2022

SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis
GAN
08 Aug 2022

Efficient One Pass Self-distillation with Zipf's Label Smoothing
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan
26 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi
OOD
05 Jul 2022

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee
FedML
31 May 2022

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima, R. Krohling
ViT, MedIm
30 May 2022

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian-jun Sun, Weiming Hu
ViT
28 May 2022

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong
08 May 2022

Merging of neural networks
Martin Pasen, Vladimír Boza
FedML, MoMe
21 Apr 2022

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy
07 Apr 2022

Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark
01 Apr 2022

On the benefits of knowledge distillation for adversarial robustness
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard
AAML, FedML
14 Mar 2022

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass
VLM
13 Mar 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt
MoMe
10 Mar 2022

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Y. Zou
VLM
03 Mar 2022

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu
16 Feb 2022

It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir
18 Jan 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash
13 Jan 2022

Microdosing: Knowledge Distillation for GAN based Compression
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers
07 Jan 2022

Ex-Model: Continual Learning from a Stream of Trained Models
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu
CLL
13 Dec 2021

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric P. Xing
VLM
02 Dec 2021

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed
01 Dec 2021

PP-ShiTu: A Practical Lightweight Image Recognition System
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma
CVBM
01 Nov 2021

Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han
17 Oct 2021

Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le
02 Aug 2021

Teacher's pet: understanding and mitigating biases in distillation
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar
19 Jun 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson
FedML
10 Jun 2021

On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, F. Khan, Fatih Porikli
ViT
08 Jun 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
04 May 2021

Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun
13 Jan 2021

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, A. Srinivas, Rui Qian, Tsung-Yi Lin, E. D. Cubuk, Quoc V. Le, Barret Zoph
ISeg
13 Dec 2020

What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag
06 Mar 2020