Inference algorithms for pattern-based CRFs on sequence data

We consider Conditional Random Fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) $x_1 \ldots x_n$ is the sum of terms over intervals $[i,j]$, where each term is non-zero only if the substring $x_i \ldots x_j$ equals a prespecified pattern $\alpha$. Such CRFs have been used in computer vision and can be naturally applied to many sequence tagging problems. Let $\Gamma$ be the set of input patterns and $L$ be their total length. Komodakis & Paragios (2009) showed how to compute MAP in time $O(nL)$ when all costs are non-positive. We present a modification that has the same worst-case complexity but can beat it in the best case. More importantly, we give efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) MAP in the general case (i.e. when costs can be positive). Their complexities are respectively $O(nL)$, $O(nL\ell_{\max})$, and $O(nL\min\{|D|, \log(\ell_{\max}+1)\})$, where $D$ is the input alphabet and $\ell_{\max} = \max_{\alpha \in \Gamma} |\alpha|$.
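To make the model concrete, the following is a minimal brute-force sketch of the pattern-based energy and of the partition function and MAP it induces. It is not the algorithm of the paper: it enumerates all strings and is therefore exponential in the sequence length, so it is useful only as a reference on tiny inputs. The function names, the position-independent cost per pattern, and the convention that the probability of a string is proportional to exp(-energy) are assumptions made for this illustration.

```python
from itertools import product
from math import exp


def energy(x, costs):
    """Sum the cost of every pattern occurrence in the string x.

    costs maps each pattern to one cost that applies at every position;
    position-independent costs are a simplifying assumption of this sketch.
    """
    total = 0.0
    for pattern, cost in costs.items():
        m = len(pattern)
        # Slide over every interval of length m and test for an exact match.
        for i in range(len(x) - m + 1):
            if x[i:i + m] == pattern:
                total += cost
    return total


def all_strings(n, alphabet):
    """Yield every string of length n over the given alphabet."""
    for symbols in product(alphabet, repeat=n):
        yield "".join(symbols)


def brute_force_partition(n, alphabet, costs):
    """Partition function as the sum of exp(-energy) over all strings."""
    return sum(exp(-energy(x, costs)) for x in all_strings(n, alphabet))


def brute_force_map(n, alphabet, costs):
    """MAP labeling as the string of minimum energy."""
    return min(all_strings(n, alphabet), key=lambda x: energy(x, costs))


if __name__ == "__main__":
    # Hypothetical patterns and costs, chosen only for illustration.
    costs = {"ab": -1.0, "b": 0.5}
    print(energy("abab", costs))
    print(brute_force_map(2, "ab", costs))
    print(brute_force_partition(1, "ab", costs))
```

A brute-force reference of this kind is mainly useful for checking an efficient implementation against exact values on short sequences and small alphabets.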