Facial landmark detection using structured output deep neural networks
Facial landmark detection is an important step for many perception tasks such as face recognition and facial analysis. Regression-based methods have been very successful on this task. In particular, deep neural networks (DNNs) have demonstrated a strong capability to model the highly non-linear relationship between the face image and the face shape. In this paper, we treat this task as a structured output problem, in which we exploit the strong dependencies between the outputs. Besides learning a regression mapping from the input to the output, we learn, in an unsupervised way, the inter-dependencies between the outputs. To this end, we propose a generic regression framework for structured output problems. Our framework successfully incorporates learning of the output structure into a DNN through pre-training. We apply our method to a facial landmark detection task, where the output is strongly structured. We evaluate our DNN, named Input/Output Deep Architecture (IODA), on two challenging public datasets: LFPW and HELEN. We show that IODA outperforms traditional deep architectures.
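The idea of learning output inter-dependencies without supervision and then reusing them in the regression network can be illustrated with a small sketch. The abstract does not specify the architecture, so the code below rests on an assumption: the output structure is captured by a one-hidden-layer auto-encoder trained on the landmark vectors alone, and its decoder is then used to initialize the last layer of the regression DNN. The function name `pretrain_output_autoencoder`, the layer sizes, and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pretrain_output_autoencoder(Y, hidden_dim, lr=0.01, epochs=500, seed=0):
    """Fit a small auto-encoder on output vectors Y of shape (n, d).

    Hypothetical sketch: no input images are used, so this step is
    unsupervised with respect to the regression task. Returns the decoder
    weights, the decoder bias, and the per-epoch reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n, d = Y.shape
    W_enc = 0.1 * rng.normal(size=(d, hidden_dim))
    b_enc = np.zeros(hidden_dim)
    W_dec = 0.1 * rng.normal(size=(hidden_dim, d))
    b_dec = np.zeros(d)
    history = []

    for _ in range(epochs):
        # Forward pass: encode the landmark vector, then reconstruct it.
        H = np.tanh(Y @ W_enc + b_enc)
        Y_hat = H @ W_dec + b_dec
        err = Y_hat - Y
        history.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))

        # Backward pass for the loss 0.5 * mean squared reconstruction error.
        d_out = err / n
        g_W_dec = H.T @ d_out
        g_b_dec = d_out.sum(axis=0)
        d_pre = (d_out @ W_dec.T) * (1.0 - H ** 2)  # derivative of tanh
        g_W_enc = Y.T @ d_pre
        g_b_enc = d_pre.sum(axis=0)

        # Plain gradient descent step.
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec
        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc

    return W_dec, b_dec, history
```

Under this assumption, `W_dec` and `b_dec` would serve as the initial values of the final layer of the regression network, which is then fine-tuned on image and landmark pairs in the usual supervised way.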