Learning to Decode for Future Success
We introduce a general strategy for improving neural sequence generation by incorporating knowledge about the future. Our decoder combines a standard sequence decoder with a "soothsayer" prediction function Q that estimates the future outcome of generating a word in the present. Our model draws on the same intuitions as reinforcement learning, but is both simpler and higher performing, avoiding known problems with the use of reinforcement learning in tasks with enormous search spaces like sequence generation. We demonstrate our model by incorporating Q functions that incrementally predict the future BLEU or ROUGE score of the completed sequence, its future length, and the backwards probability of the source given the future target sequence. Experimental results show that future prediction yields improved performance in abstractive summarization and conversational response generation and state-of-the-art results in machine translation, while also enabling the decoder to generate outputs that have specific properties.
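As a rough illustration of the idea, the sketch below rescores the candidate next words at one decoding step by adding a weighted future estimate to the decoder's log-probability. The abstract does not state how the two signals are combined, so the linear interpolation, the weight `lam`, and the names `rescore_step` and `toy_length_q` are all assumptions made for this example. The toy Q function is hand-written rather than learned and only mimics a future-length predictor.

```python
import math
from typing import Callable, Dict, List, Tuple


def rescore_step(
    prefix: List[str],
    log_probs: Dict[str, float],
    q_fn: Callable[[List[str]], float],
    lam: float = 1.0,
    beam_size: int = 2,
) -> List[Tuple[str, float]]:
    """Rank next-word candidates by log p(word | prefix) + lam * Q(prefix + word).

    The additive combination is an assumption for illustration only.
    """
    scored = [
        (tok, lp + lam * q_fn(prefix + [tok]))
        for tok, lp in log_probs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:beam_size]


def toy_length_q(target_len: int) -> Callable[[List[str]], float]:
    """Hypothetical stand-in for a learned Q: penalize ending far from target_len."""
    def q(seq: List[str]) -> float:
        if seq and seq[-1] == "</s>":
            return -abs(target_len - len(seq))
        return 0.0
    return q


# Decoder distribution at one step: ending the sequence is the most likely choice.
step_log_probs = {
    "</s>": math.log(0.6),
    "there": math.log(0.3),
    "world": math.log(0.1),
}
prefix = ["hello"]
q = toy_length_q(target_len=4)

greedy = rescore_step(prefix, step_log_probs, q, lam=0.0)
guided = rescore_step(prefix, step_log_probs, q, lam=1.0)
print(greedy[0][0], guided[0][0])
```

With `lam=0.0` the ranking is the decoder's own, so the end-of-sequence token wins. With `lam=1.0` the length penalty outweighs its higher probability and a continuing word is preferred, which is the kind of control over output properties the abstract refers to.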