Think Again Networks, the Delta Loss, and an Application in Language Modeling
Abstract
This short paper introduces an abstraction called Think Again Networks (ThinkNet), which can be applied to any state-dependent function, such as a recurrent neural network. We show a simple application in language modeling that achieves state-of-the-art perplexity on the Penn Treebank.
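The abstraction described above can be sketched in a few lines: a state-dependent function is reapplied to its own state for several "thinking" steps, and a loss is collected at each step. This is a minimal illustrative sketch, not the paper's exact formulation; in particular, the hinge-style combination in `delta_loss` is an assumption about how a "delta" loss might compare successive steps, and the names `think_net` and `delta_loss` are hypothetical.

```python
def think_net(f, state, x, steps=3):
    """Apply a state-dependent function f repeatedly ("think again"),
    returning the state produced by each thinking step."""
    states = []
    for _ in range(steps):
        state = f(state, x)  # reapply f to its own previous state
        states.append(state)
    return states


def delta_loss(losses):
    """Hypothetical combination of per-step losses: the final step's loss
    plus a hinge penalty for any step whose loss got worse than the
    previous step's (an assumed reading of a "delta" loss, not the
    paper's definition)."""
    total = losses[-1]
    for prev, curr in zip(losses[:-1], losses[1:]):
        total += max(0.0, curr - prev)
    return total
```

For example, with a toy state update `f(s, x) = 0.5 * s + x`, three thinking steps starting from `s = 0` refine the state toward a fixed point, and `delta_loss` then combines the losses measured after each step.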
