Model Distillation with Knowledge Transfer in Face Classification, Alignment and Verification
- CVBM

Knowledge distillation is a potential solution for model compression. The idea is to train a small student model to imitate the output of a large teacher model, so that the student becomes competitive with the teacher. Most previous studies focus only on the classification task, for which they propose different forms of teacher supervision; other tasks are rarely considered, and the importance of student initialization is mostly ignored. To overcome these two limitations, we propose face model distillation with strong student initialization and knowledge transfer, which boosts not only face classification but also the domain-similar tasks of face alignment and verification. First, in face classification, a student model with all layers initialized is trained in a multi-task way with both its class labels and teacher supervision. Then, similar multi-task training is adopted for alignment and verification, with knowledge transferred from classification. Evaluation on the CASIA-WebFace and CelebA datasets demonstrates that the student can be competitive with the teacher in all three tasks, and can even surpass the teacher under appropriate compression rates. We also test the proposed method on the large-scale MS-Celeb-1M database, where the student likewise achieves competitive performance.
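The multi-task training in the classification stage can be illustrated with a minimal sketch. The abstract does not state the exact form of teacher supervision, so this example assumes a common choice: a cross-entropy between temperature-softened teacher and student outputs, added to the usual class-label loss. The function names, the weight `alpha`, and the `temperature` value are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, label):
    """Hard-label term: negative log-probability of the true class."""
    return -math.log(softmax(student_logits)[label])


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Teacher-supervision term (assumed form): cross-entropy between
    the softened teacher distribution and the softened student one."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Multi-task objective: class-label loss plus weighted teacher loss.
    alpha and temperature are illustrative hyperparameters."""
    hard = cross_entropy(student_logits, label)
    soft = soft_target_loss(student_logits, teacher_logits, temperature)
    return hard + alpha * soft


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]   # toy student logits for 3 identities
    teacher = [3.0, 0.0, -2.0]   # toy teacher logits for the same input
    print(distillation_loss(student, teacher, label=0))
```

Setting `alpha=0` reduces the objective to plain supervised training, which makes it easy to compare a student trained with and without teacher supervision under the same initialization.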