Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again
10 October 2022 · arXiv:2210.04427
Xin-Chun Li, Wenxuan Fan, Shaoming Song, Yinchuan Li, Bingshuai Li, Yunfeng Shao, De-Chuan Zhan

Papers citing "Asymmetric Temperature Scaling Makes Larger Networks Teach Well Again" (7 of 7 papers shown):

ABKD: Pursuing a Proper Allocation of the Probability Mass in Knowledge Distillation via α-β-Divergence
Guanghui Wang, Zhiyong Yang, Z. Wang, Shi Wang, Qianqian Xu, Q. Huang
07 May 2025

Knowledge Diffusion for Distillation
Tao Huang, Yuan Zhang, Mingkai Zheng, Shan You, Fei Wang, Chao Qian, Chang Xu
25 May 2023

Distilling Knowledge via Knowledge Review
Pengguang Chen, Shu-Lin Liu, Hengshuang Zhao, Jiaya Jia
19 Apr 2021

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher
Guangda Ji, Zhanxing Zhu
20 Oct 2020

Large scale distributed neural network training through online distillation
Rohan Anil, Gabriel Pereyra, Alexandre Passos, Róbert Ormándi, George E. Dahl, Geoffrey E. Hinton
09 Apr 2018

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam
17 Apr 2017

Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross B. Girshick, Piotr Dollár, Z. Tu, Kaiming He
16 Nov 2016