Multi-task Learning for Structured Output Prediction
A deep neural network is a powerful framework for learning representations. Usually, it is used to learn the mapping from x to y by exploiting regularities in the input x, without considering the representation of the output y. In structured output prediction problems, where the output is multi-dimensional and structural relations exist between its dimensions, the network tends to overfit when training data are limited. To overcome this issue and reduce the amount of data required for accurate predictions, we propose in this paper a regularization scheme for training neural networks on such tasks. Our scheme incorporates learning the output representation y into the training process while learning the mapping from x to y. It is a multi-task framework that combines the supervised task with two unsupervised tasks over the input and the output data, respectively. We also experiment with using output labels y without their corresponding inputs x. We evaluate our framework on facial landmark detection, a typical structured output prediction task, and show on two challenging public datasets (LFPW and HELEN) that our regularization scheme improves the generalization of deep neural networks and accelerates their training. The use of unlabeled data is also explored, yielding a further improvement in the results. An open-source implementation of our framework is available at https://github.com/sbelharbi/structured-output-ae.
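The combined objective described above can be sketched as a weighted sum of the supervised loss and the two unsupervised reconstruction losses. This is a minimal illustrative sketch, not the paper's exact formulation: the function name `multitask_loss`, the weights `lambda_in` and `lambda_out`, and the use of mean squared error for every term are assumptions made here for clarity.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))

def multitask_loss(x, x_rec, y, y_rec, y_pred,
                   lambda_in=0.1, lambda_out=0.1):
    """Illustrative combined loss (names and weighting are assumptions):
    a supervised term plus two unsupervised reconstruction terms
    acting as regularizers.

    x, x_rec : input and its reconstruction by an input autoencoder
    y, y_rec : output labels and their reconstruction by an output autoencoder
    y_pred   : supervised prediction of y from x
    """
    l_sup = mse(y_pred, y)   # supervised task: learn x -> y
    l_in = mse(x_rec, x)     # unsupervised task over the input x
    l_out = mse(y_rec, y)    # unsupervised task over the output y
    return l_sup + lambda_in * l_in + lambda_out * l_out

# Example: with perfect reconstructions and predictions, the loss is zero;
# the unsupervised terms only add to the loss when reconstruction degrades.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
y = rng.normal(size=(4, 6))
print(multitask_loss(x, x, y, y, y))  # → 0.0
```

Because the output autoencoder term depends only on y, it can be trained on output labels that have no paired input, which is how the framework exploits unlabeled y data.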