Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu, Yuanzhi Li
17 December 2020 (arXiv:2012.09816)
FedML

Papers citing "Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning"

Showing 50 of 215 citing papers.
Feature Space Particle Inference for Neural Network Ensembles
Shingo Yashima, Teppei Suzuki, Kohta Ishikawa, Ikuro Sato, Rei Kawakami
BDL
02 Jun 2022

CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing
Zhiwei Hao, Yong Luo, Zhi Wang, Han Hu, J. An
24 May 2022

Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi, Shun Kiyono, Jun Suzuki, Kentaro Inui
MoMe
24 May 2022

The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains
Haoran Xu, Philipp Koehn, Kenton W. Murray
MoMe
23 May 2022

ELODI: Ensemble Logit Difference Inhibition for Positive-Congruent Training
Yue Zhao, Yantao Shen, Yuanjun Xiong, Shuo Yang, Wei Xia, Z. Tu, Bernt Schiele, Stefano Soatto
BDL
12 May 2022

Knowledge Distillation for Multi-Target Domain Adaptation in Real-Time Person Re-Identification
Félix Remigereau, Djebril Mekhazni, Sajjad Abdoli, Le Thanh Nguyen-Meidine, Rafael M. O. Cruz, Eric Granger
12 May 2022

The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen, Yuanzhi Li
SSL
12 May 2022

Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning
Yae Jee Cho, Andre Manoel, Gauri Joshi, Robert Sim, Dimitrios Dimitriadis
FedML
27 Apr 2022

BLCU-ICALL at SemEval-2022 Task 1: Cross-Attention Multitasking Framework for Definition Modeling
Cunliang Kong, Yujie Wang, Ruining Chong, Liner Yang, Hengyuan Zhang, Erhong Yang, Yaping Huang
16 Apr 2022

Ensemble diverse hypotheses and knowledge distillation for unsupervised cross-subject adaptation
Kuangen Zhang, Jiahong Chen, Jing Wang, Xinxing Chen, Yuquan Leng, Clarence W. de Silva, Chenglong Fu
15 Apr 2022

Qtrade AI at SemEval-2022 Task 11: An Unified Framework for Multilingual NER Task
Weichao Gan, Yuan-Chun Lin, Guang-Zhu Yu, Guimin Chen, Qian Ye
14 Apr 2022

Bimodal Distributed Binarized Neural Networks
T. Rozen, Moshe Kimhi, Brian Chmiel, A. Mendelson, Chaim Baskin
MQ
05 Apr 2022

Unified and Effective Ensemble Knowledge Distillation
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
FedML
01 Apr 2022

Efficient Maximal Coding Rate Reduction by Variational Forms
Christina Baek, Ziyang Wu, Kwan Ho Ryan Chan, Tianjiao Ding, Yi-An Ma, B. Haeffele
31 Mar 2022

Understanding out-of-distribution accuracies through quantifying difficulty of test samples
Berfin Simsek, Melissa Hall, Levent Sagun
28 Mar 2022

UTSA NLP at SemEval-2022 Task 4: An Exploration of Simple Ensembles of Transformers, Convolutional, and Recurrent Neural Networks
Xingmeng Zhao, Anthony Rios
28 Mar 2022

Linking Emergent and Natural Languages via Corpus Transfer
Shunyu Yao, Mo Yu, Yang Zhang, Karthik Narasimhan, J. Tenenbaum, Chuang Gan
24 Mar 2022

Importance Sampling CAMs for Weakly-Supervised Segmentation
Arvi Jonnarth, M. Felsberg
23 Mar 2022

Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang, Junyang Lin, Chang Zhou, Hongxia Yang, Longbo Huang
23 Mar 2022

Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization
Junyuan Hong, Haotao Wang, Zhangyang Wang, Jiayu Zhou
FedML
18 Mar 2022

Easy Ensemble: Simple Deep Ensemble Learning for Sensor-Based Human Activity Recognition
Tatsuhito Hasegawa, Kazuma Kondo
08 Mar 2022

Better Supervisory Signals by Observing Learning Paths
Yi Ren, Shangmin Guo, Danica J. Sutherland
04 Mar 2022

Ensembles of Vision Transformers as a New Paradigm for Automated Classification in Ecology
S. Kyathanahally, T. Hardeman, M. Reyes, E. Merz, T. Bulas, P. Brun, F. Pomati, M. Baity-Jesi
03 Mar 2022

Data Augmentation as Feature Manipulation
Ruoqi Shen, Sébastien Bubeck, Suriya Gunasekar
MLT
03 Mar 2022

A Semi-supervised Learning Approach with Two Teachers to Improve Breakdown Identification in Dialogues
Qian Lin, Hwee Tou Ng
22 Feb 2022

What is Next when Sequential Prediction Meets Implicitly Hard Interaction?
Kaixi Hu, Lin Li, Qing Xie, Jianquan Liu, Xiaohui Tao
14 Feb 2022

Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao, Zixiang Chen, M. Belkin, Quanquan Gu
MLT
14 Feb 2022

Real World Large Scale Recommendation Systems Reproducibility and Smooth Activations
G. Shamir, Dong Lin
HAI, OffRL
14 Feb 2022

Semi-supervised New Event Type Induction and Description via Contrastive Loss-Enforced Batch Attention
Carl N. Edwards, Heng Ji
12 Feb 2022

Reproducibility in Optimization: Theoretical Framework and Limits
Kwangjun Ahn, Prateek Jain, Ziwei Ji, Satyen Kale, Praneeth Netrapalli, G. Shamir
09 Feb 2022

Fortuitous Forgetting in Connectionist Networks
Hattie Zhou, Ankit Vani, Hugo Larochelle, Aaron Courville
CLL
01 Feb 2022

Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems
Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti
15 Jan 2022

Extracting knowledge from features with multilevel abstraction
Jin-Siang Lin, Zhaoyang Li
04 Dec 2021

Efficient Self-Ensemble for Semantic Segmentation
Walid Bousselham, Guillaume Thibault, Lucas Pagano, Archana Machireddy, Joe W. Gray, Y. Chang, Xubo B. Song
ViT
26 Nov 2021

Self-Distilled Self-Supervised Representation Learning
J. Jang, Seonhoon Kim, Kiyoon Yoo, Chaerin Kong, Jang-Hyun Kim, Nojun Kwak
SSL
25 Nov 2021

Multi-label Iterated Learning for Image Classification with Label Ambiguity
Sai Rajeswar, Pau Rodríguez López, Soumye Singhal, David Vazquez, Aaron C. Courville
VLM
23 Nov 2021

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Bonan Min, Hayley L Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heintz, Dan Roth
LM&MA, VLM, AI4CE
01 Nov 2021

Transformer Ensembles for Sexism Detection
Lily Davies, Marta Baldracchi, C. Borella, K. Perifanos
ViT
29 Oct 2021

Towards Model Agnostic Federated Learning Using Knowledge Distillation
A. Afonin, Sai Praneeth Karimireddy
FedML
28 Oct 2021

Combining Different V1 Brain Model Variants to Improve Robustness to Image Corruptions in CNNs
A. Baidya, Joel Dapello, J. DiCarlo, Tiago Marques
AAML
20 Oct 2021

Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation
Sumanth Chennupati, Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen
19 Oct 2021

Combining Diverse Feature Priors
Saachi Jain, Dimitris Tsipras, A. Madry
15 Oct 2021

Dropout Prediction Uncertainty Estimation Using Neuron Activation Strength
Haichao Yu, Zhe Chen, Dong Lin, G. Shamir, Jie Han
UQCV
13 Oct 2021

Deep Neural Compression Via Concurrent Pruning and Self-Distillation
J. Ó. Neill, Sourav Dutta, H. Assem
VLM
30 Sep 2021

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization
Zehao Dou, Yuanzhi Li
29 Sep 2021

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou, Yuan Cao, Yuanzhi Li, Quanquan Gu
MLT, AI4CE
25 Aug 2021

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig
VLM, SyDa
28 Jul 2021

Bagging, optimized dynamic mode decomposition (BOP-DMD) for robust, stable forecasting with spatial and temporal uncertainty-quantification
Diya Sashidhar, J. Nathan Kutz
22 Jul 2021

Technical Report of Team GraphMIRAcles in the WikiKG90M-LSC Track of OGB-LSC @ KDD Cup 2021
Jianyu Cai, Jiajun Chen, Taoxing Pan, Zhanqiu Zhang, Jie Wang
12 Jul 2021

R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, M. Zhang, Tie-Yan Liu
28 Jun 2021