Multiple Template Learning for Structured Prediction
Conditional random fields (CRFs) and structural support vector machines (structural SVMs) are two state-of-the-art algorithms for structured prediction, a setting in which the model must capture interdependencies among output variables. Their success is attributed to discriminative models that can account for overlapping features over the whole input observation. These features are usually generated by applying a given set of templates to labeled data, and improper templates may degrade performance. To alleviate this issue, we propose a novel multiple template learning paradigm that learns the structured predictor and the importance of each template simultaneously, so that arbitrary templates can be added to the learning model without careful manual selection. This paradigm can be formulated as a special multiple kernel learning problem with an exponential number of constraints. We then introduce an efficient cutting plane algorithm that solves this problem in the primal. We evaluate the proposed paradigm on two widely studied structured prediction tasks: sequence labeling and dependency parsing. Extensive experimental results show that the proposed method outperforms CRFs and structural SVMs because it exploits the importance of each template.
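To make the notion of templates and per-template importance concrete, the following is a minimal Python sketch for sequence labeling. It is an illustration under stated assumptions, not the paper's formulation: the three templates, the names `extract` and `score`, and the choice to scale each template's contribution by a single non-negative importance value are all hypothetical.

```python
from collections import defaultdict

# Hypothetical templates: each maps (tokens, position, label) to a feature string.
TEMPLATES = {
    "word": lambda toks, i, y: f"w={toks[i]}|y={y}",
    "prev_word": lambda toks, i, y: f"w-1={toks[i - 1] if i > 0 else '<s>'}|y={y}",
    "suffix2": lambda toks, i, y: f"suf={toks[i][-2:]}|y={y}",
}

def extract(tokens, labels):
    """Apply every template at every position; keep counts grouped by template."""
    feats = {name: defaultdict(float) for name in TEMPLATES}
    for i, y in enumerate(labels):
        for name, fn in TEMPLATES.items():
            feats[name][fn(tokens, i, y)] += 1.0
    return feats

def score(feats, weights, importance):
    """Sum per-template linear scores, each scaled by that template's importance.

    Assumption for illustration: an importance of 0 switches a template off,
    which is how an unhelpful template could be ignored by the model.
    """
    total = 0.0
    for name, counts in feats.items():
        block = sum(weights.get(f, 0.0) * v for f, v in counts.items())
        total += importance.get(name, 0.0) * block
    return total
```

In this sketch, calling `extract(["the", "dog", "runs"], ["D", "N", "V"])` groups features by the template that produced them, so a learner could adjust one importance value per template alongside the ordinary feature weights.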