A Latent Variable Recurrent Neural Network for Discourse Relation Language Models
This paper presents a novel latent variable recurrent neural network architecture for jointly modeling sequences of words and (possibly latent) discourse relations that link adjacent sentences. A recurrent neural network generates individual words, thus reaping the benefits of discriminatively-trained vector representations. The discourse relations are represented with a latent variable, which can be predicted or marginalized, depending on the task. The resulting model outperforms state-of-the-art alternatives for implicit discourse relation classification in the Penn Discourse Treebank, and for dialog act classification in the Switchboard corpus. By marginalizing over latent discourse relations, it also yields a language model that improves on a strong recurrent neural network baseline.
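To make the architecture concrete, below is a minimal PyTorch sketch of the core idea: a word-level RNN whose output distribution for each sentence is mediated by a discrete discourse-relation variable z, which can be marginalized out for language modeling. This is an illustrative reconstruction under simplifying assumptions (per-relation output projections, a relation prior conditioned on the previous sentence's final state), not the authors' implementation; all names and shapes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentRelationLM(nn.Module):
    """Sketch of a latent discourse-relation RNN language model.

    A GRU generates each sentence word by word; a discrete relation
    variable z selects the output projection. z can be predicted
    (argmax over joint scores) or marginalized (logsumexp), as the
    abstract describes. Parameterization is illustrative only.
    """
    def __init__(self, vocab_size, hidden_size, num_relations):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # One output projection per relation: p(word | h, z).
        self.out = nn.ModuleList(
            nn.Linear(hidden_size, vocab_size) for _ in range(num_relations)
        )
        # Prior over relations given the previous sentence's final state.
        self.rel_prior = nn.Linear(hidden_size, num_relations)

    def sentence_logprob(self, words, prev_state):
        """Log p(sentence | context), marginalizing the relation.

        words: LongTensor (1, T) of token ids; prev_state: (1, 1, H).
        Returns the sentence log-probability and the final RNN state.
        """
        emb = self.embed(words[:, :-1])            # input tokens
        hs, last = self.rnn(emb, prev_state)       # (1, T-1, H)
        targets = words[:, 1:]                     # next-word targets
        log_prior = F.log_softmax(self.rel_prior(prev_state[-1]), dim=-1)
        per_rel = []
        for z, proj in enumerate(self.out):
            logp = F.log_softmax(proj(hs), dim=-1)         # (1, T-1, V)
            tok = logp.gather(2, targets.unsqueeze(2))     # target scores
            per_rel.append(tok.sum() + log_prior[0, z])    # joint log-prob
        # Marginalize: log sum_z p(z | ctx) p(words | z, ctx).
        return torch.logsumexp(torch.stack(per_rel), dim=0), last

# Hypothetical usage on random token ids:
model = LatentRelationLM(vocab_size=1000, hidden_size=64, num_relations=4)
sent = torch.randint(0, 1000, (1, 7))
logp, h = model.sentence_logprob(sent, torch.zeros(1, 1, 64))
```

Predicting the relation for classification would amount to taking the argmax over the per-relation joint scores instead of the logsumexp; the paper's actual parameterization of how z conditions generation differs in detail.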