Boosting of Head Pose Estimation by Knowledge Distillation

Abstract

We propose a response-based method of knowledge distillation (KD) for the head pose estimation problem. A student model trained by the proposed KD achieves better results than its teacher model, which is atypical for response-based methods. Our method consists of two stages. In the first stage, we train the base neural network (NN), which has one regression head and four regression via classification (RvC) heads. We build a convolutional ensemble over the base NN using offsets of face bounding boxes over a regular grid. In the second stage, we perform KD from the convolutional ensemble into the final NN with one RvC head. The KD improves the results by an average of 7.7% compared to the base NN. This feature makes it possible to use KD as a booster and to train deeper NNs effectively. NNs trained by our KD method improve on some of the state-of-the-art results. KD-ResNet152 has the best results, and KD-ResNet18 has a better result on the AFLW2000 dataset than any previous method. We have made publicly available the trained NNs and the face bounding boxes for the 300W-LP, AFLW, AFLW2000, and BIWI datasets. Our method may also be effective for other regression problems.
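
The abstract does not spell out the loss, the bin layout, or how the ensemble outputs are combined, so the following is only a minimal NumPy sketch of the general idea under stated assumptions. It assumes an RvC head outputs logits over angle bins and decodes an angle as the softmax-weighted mean of the bin centers, that the ensemble's soft target is the plain average of its members' bin distributions, and that the student is trained with cross-entropy against that target. The names `rvc_decode`, `ensemble_soft_targets`, and `kd_loss`, the 3-degree bins, and the 9 grid offsets are all hypothetical.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def rvc_decode(logits, bin_centers):
    """Regression via classification: expected angle under the bin distribution."""
    return softmax(logits) @ bin_centers

def ensemble_soft_targets(member_logits):
    """Average the bin distributions of the ensemble members.

    member_logits has shape (members, bins); averaging is an assumption here,
    not necessarily how the paper combines the offset crops.
    """
    return softmax(member_logits).mean(axis=0)

def kd_loss(student_logits, teacher_probs):
    """Cross-entropy between the teacher's soft targets and the student's output."""
    log_p = np.log(softmax(student_logits) + 1e-12)
    return -(teacher_probs * log_p).sum(axis=-1)

# Hypothetical bin layout: 67 bins with centers from -99 to 99 degrees, step 3.
bin_centers = np.arange(-99, 100, 3, dtype=float)

# Hypothetical ensemble: one set of logits per bounding-box offset on a 3x3 grid.
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(9, bin_centers.size))
teacher = ensemble_soft_targets(member_logits)

# A student that reproduces the teacher distribution versus an unrelated one.
matched_student = np.log(teacher)
random_student = rng.normal(size=bin_centers.size)
loss_matched = kd_loss(matched_student, teacher)
loss_random = kd_loss(random_student, teacher)

# Decoding a sharply peaked distribution recovers the center of the peaked bin.
peaked = np.zeros(bin_centers.size)
peaked[40] = 50.0
decoded = rvc_decode(peaked, bin_centers)
```

In this sketch the loss is smallest when the student's bin distribution matches the ensemble average, so `loss_matched` is no larger than `loss_random`.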
