Generative Bridging Network in Neural Sequence Prediction

Abstract

Maximum Likelihood Estimation (MLE) is known to suffer from data sparsity in sequence prediction tasks. To alleviate data sparseness, we propose a novel framework that trains sequence models via a bridging process. Unlike MLE, which optimizes the sequence generator by directly maximizing the likelihood of the ground-truth sequence given the input, our framework designs a bridge to connect the generator with the ground truth. During training, we first transform the pointwise ground truth into a bridge distribution under certain constraints, then match the generator's output distribution with this bridge distribution by minimizing their KL-divergence. Different constraints endow the bridge distribution with different properties. To increase output diversity, enhance language smoothness, and reduce the learning burden, we design three regularization constraints that yield three different bridge distributions. Combining these bridges with the sequence generator, we build three parallel generative bridging networks (GBNs): uniform GBN, language-model GBN, and coaching GBN. Experimental results on two well-recognized sequence prediction tasks show that GBN yields significant improvements over the baseline systems. Furthermore, we draw samples from the three bridge distributions to analyze their properties and verify their influence on sequence model learning.
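The core training step described above can be illustrated with a minimal sketch. The snippet below is not the paper's implementation; it only shows the general idea for the uniform-bridge case under assumed details: the one-hot ground truth is smoothed toward the uniform distribution (the hypothetical `uniform_bridge` helper and the `eps` mixing weight are illustrative choices), and the generator's output distribution would then be trained to minimize KL-divergence to that bridge.

```python
import numpy as np

def uniform_bridge(one_hot, eps=0.1):
    # Smooth a one-hot target toward the uniform distribution
    # (illustrative analogue of a uniform bridge; eps is assumed).
    vocab_size = one_hot.size
    return (1.0 - eps) * one_hot + eps / vocab_size

def kl_divergence(p, q):
    # KL(p || q) for discrete distributions, natural log;
    # terms with p == 0 contribute nothing.
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# One-hot ground truth over a toy 4-token vocabulary
target = np.array([0.0, 1.0, 0.0, 0.0])
bridge = uniform_bridge(target, eps=0.1)   # [0.025, 0.925, 0.025, 0.025]

# A generator's output distribution for the same step;
# training would push this toward the bridge by minimizing the KL term.
model_dist = np.array([0.1, 0.7, 0.1, 0.1])
loss = kl_divergence(bridge, model_dist)
```

In an actual sequence model this loss would be summed over time steps and backpropagated through the generator; the bridge itself stays fixed by its constraint.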
