VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale (arXiv:2305.15781)
25 May 2023
Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang
Papers citing "VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale" (10 of 10 papers shown):
"Dynamic Gradient Sparse Update for Edge Training". I-Hsuan Li, Tian-Sheuan Chang. 23 Mar 2025.
"CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective". Wencheng Zhu, Xin Zhou, Pengfei Zhu, Yu Wang, Qinghua Hu. 22 Apr 2024.
"Revisiting Knowledge Distillation under Distribution Shift". Songming Zhang, Ziyu Lyu, Xiaofeng Chen. 25 Dec 2023.
"NORM: Knowledge Distillation via N-to-One Representation Matching". Xiaolong Liu, Lujun Li, Chao Li, Anbang Yao. 23 May 2023.
"Function-Consistent Feature Distillation". Dongyang Liu, Meina Kan, Shiguang Shan, Xilin Chen. 24 Apr 2023.
"ResNet strikes back: An improved training procedure in timm". Ross Wightman, Hugo Touvron, Hervé Jégou. 01 Oct 2021.
"CMT: Convolutional Neural Networks Meet Vision Transformers". Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Chunjing Xu, Yunhe Wang, Chang Xu. 13 Jul 2021.
"Distilling Knowledge via Knowledge Review". Pengguang Chen, Shu-Lin Liu, Hengshuang Zhao, Jiaya Jia. 19 Apr 2021.
"Aggregated Residual Transformations for Deep Neural Networks". Saining Xie, Ross B. Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He. 16 Nov 2016.
"Xception: Deep Learning with Depthwise Separable Convolutions". François Chollet. 07 Oct 2016.