
Think Again Networks, the Delta Loss, and an Application in Language Modeling

Abstract

This short paper introduces an abstraction called Think Again Networks (ThinkNet), which can be applied to any state-dependent function (such as a recurrent neural network). Here we show a simple application in language modeling that achieves state-of-the-art perplexity on the Penn Treebank.
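A minimal sketch of the idea as stated here, assuming (since the abstract gives no details) that a ThinkNet re-applies a state-dependent function several times, feeding each pass's state back in, and that the Delta Loss penalizes disagreement between successive passes. All names and the squared-difference loss form are illustrative, not the paper's exact formulation.

```python
def thinknet(f, x, s0, passes=3):
    """Re-apply a state-dependent function f(x, state) -> (output, state),
    feeding each pass's state back in ("thinking again")."""
    outputs, state = [], s0
    for _ in range(passes):
        out, state = f(x, state)
        outputs.append(out)
    return outputs, state

def delta_loss(outputs):
    """Penalize disagreement between successive passes
    (illustrative squared-difference form, not the paper's exact loss)."""
    return sum((a - b) ** 2 for a, b in zip(outputs, outputs[1:]))

# Toy state-dependent function: state moves halfway toward the input.
def f(x, s):
    s_new = 0.5 * (s + x)
    return s_new, s_new

outs, _ = thinknet(f, x=1.0, s0=0.0, passes=4)
print(outs)              # [0.5, 0.75, 0.875, 0.9375] -- successive "thoughts"
print(delta_loss(outs))  # shrinks as the passes converge
```

In a language-modeling setting, `f` would be the recurrent network itself and the Delta Loss would be one term in the training objective, encouraging later passes to agree with (and refine) earlier ones.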
