Knowledge distillation: A good teacher is patient and consistent
Lucas Beyer, Xiaohua Zhai, Amelie Royer, L. Markeeva, Rohan Anil, Alexander Kolesnikov · VLM · 9 June 2021
arXiv:2106.05237
Papers citing "Knowledge distillation: A good teacher is patient and consistent" (showing 50 of 203)
Progressive Learning without Forgetting
Tao Feng, Hangjie Yuan, Mang Wang, Ziyuan Huang, Ang Bian, Jianzhou Zhang · CLL, KELM · 28 Nov 2022

Join the High Accuracy Club on ImageNet with A Binary Neural Network Ticket
Nianhui Guo, Joseph Bethge, Christoph Meinel, Haojin Yang · MQ · 23 Nov 2022

VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz, James Harrison, C. Freeman, Amil Merchant, Lucas Beyer, ..., Naman Agrawal, Ben Poole, Igor Mordatch, Adam Roberts, Jascha Narain Sohl-Dickstein · 17 Nov 2022

Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev · 17 Nov 2022

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Yu Wang, Xin Li, Shengzhao Wen, Fu-En Yang, Wanping Zhang, Gang Zhang, Haocheng Feng, Junyu Han, Errui Ding · 15 Nov 2022

Structured Knowledge Distillation Towards Efficient and Compact Multi-View 3D Detection
Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma · 14 Nov 2022

Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation
Florian Schmid, Khaled Koutini, Gerhard Widmer · ViT · 09 Nov 2022

Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt · VLM · 01 Nov 2022

SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen, Shouzhen Chen, Mingyuan Bai, Junbin Gao, Junping Zhang, Jian Pu · 18 Oct 2022

Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman · 16 Oct 2022

Knowledge Distillation approach towards Melanoma Detection
Md Shakib Khan, Kazi Nabiul Alam, Abdur Rab Dhruba, H. Zunair, Nabeel Mohammed · 14 Oct 2022

Students taught by multimodal teachers are superior action recognizers
Gorjan Radevski, Dusan Grujicic, Matthew Blaschko, Marie-Francine Moens, Tinne Tuytelaars · 09 Oct 2022

Robust Active Distillation
Cenk Baykal, Khoa Trinh, Fotis Iliopoulos, Gaurav Menghani, Erik Vee · 03 Oct 2022

Global Semantic Descriptors for Zero-Shot Action Recognition
Valter Estevam, Rayson Laroca, Hélio Pedrini, David Menotti · 24 Sep 2022

TeST: Test-time Self-Training under Distribution Shift
Samarth Sinha, Peter V. Gehler, Francesco Locatello, Bernt Schiele · TTA, OOD · 23 Sep 2022

Layerwise Bregman Representation Learning with Applications to Knowledge Distillation
Ehsan Amid, Rohan Anil, Christopher Fifty, Manfred K. Warmuth · 15 Sep 2022

Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai · 13 Sep 2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases
Rohan Taori, Tatsunori B. Hashimoto · 08 Sep 2022

Effectiveness of Function Matching in Driving Scene Recognition
Shingo Yashima · 20 Aug 2022

SKDCGN: Source-free Knowledge Distillation of Counterfactual Generative Networks using cGANs
Sameer Ambekar, Matteo Tafuro, Ankit Ankit, Diego van der Mast, Mark Alence, C. Athanasiadis · GAN · 08 Aug 2022

Efficient One Pass Self-distillation with Zipf's Label Smoothing
Jiajun Liang, Linze Li, Z. Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan · 26 Jul 2022

Predicting Out-of-Domain Generalization with Neighborhood Invariance
Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi · OOD · 05 Jul 2022

What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee · FedML · 31 May 2022

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Leandro M. de Lima, R. Krohling · ViT, MedIm · 30 May 2022

A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang, Jin Gao, Zeming Li, Jian-jun Sun, Weiming Hu · ViT · 28 May 2022

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong · 08 May 2022

Merging of neural networks
Martin Pasen, Vladimír Boza · FedML, MoMe · 21 Apr 2022

Solving ImageNet: a Unified Scheme for Training any Backbone to Top Results
T. Ridnik, Hussam Lawen, Emanuel Ben-Baruch, Asaf Noy · 07 Apr 2022

Consistency driven Sequential Transformers Attention Model for Partially Observable Scenes
Samrudhdhi B. Rangrej, C. Srinidhi, J. Clark · 01 Apr 2022

On the benefits of knowledge distillation for adversarial robustness
Javier Maroto, Guillermo Ortiz-Jiménez, P. Frossard · AAML, FedML · 14 Mar 2022

CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James R. Glass · VLM · 13 Mar 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman, Gabriel Ilharco, S. Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, ..., Hongseok Namkoong, Ali Farhadi, Y. Carmon, Simon Kornblith, Ludwig Schmidt · MoMe · 10 Mar 2022

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning
Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, James Y. Zou · VLM · 03 Mar 2022

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu · 16 Feb 2022

It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
Emanuel Ben-Baruch, M. Karklinsky, Yossi Biton, Avi Ben-Cohen, Hussam Lawen, Nadav Zamir · 18 Jan 2022

SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
K. Navaneet, Soroush Abbasi Koohpayegani, Ajinkya Tejankar, Hamed Pirsiavash · 13 Jan 2022

Microdosing: Knowledge Distillation for GAN based Compression
Leonhard Helminger, Roberto Azevedo, Abdelaziz Djelouah, Markus Gross, Christopher Schroers · 07 Jan 2022

Ex-Model: Continual Learning from a Stream of Trained Models
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, D. Bacciu · CLL · 13 Dec 2021

A Fast Knowledge Distillation Framework for Visual Recognition
Zhiqiang Shen, Eric P. Xing · VLM · 02 Dec 2021

The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
Yuki M. Asano, Aaqib Saeed · 01 Dec 2021

PP-ShiTu: A Practical Lightweight Image Recognition System
Shengyun Wei, Ruoyu Guo, Cheng Cui, Bin Lu, Shuilong Dong, ..., Xueying Lyu, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma · CVBM · 01 Nov 2021

Network Augmentation for Tiny Deep Learning
Han Cai, Chuang Gan, Ji Lin, Song Han · 17 Oct 2021

Semi-Supervising Learning, Transfer Learning, and Knowledge Distillation with SimCLR
Khoi Duc Minh Nguyen, Y. Nguyen, Bao Le · 02 Aug 2021

Teacher's pet: understanding and mitigating biases in distillation
Michal Lukasik, Srinadh Bhojanapalli, A. Menon, Sanjiv Kumar · 19 Jun 2021

Does Knowledge Distillation Really Work?
Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, A. Wilson · FedML · 10 Jun 2021

On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, F. Khan, Fatih Porikli · ViT · 08 Jun 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy · 04 May 2021

Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels
Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun · 13 Jan 2021

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, A. Srinivas, Rui Qian, Tsung-Yi Lin, E. D. Cubuk, Quoc V. Le, Barret Zoph · ISeg · 13 Dec 2020

What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag · 06 Mar 2020