Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

30 June 2019

Papers citing "Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog"

2 / 102 papers shown

Title
Deep Reinforcement Learning for Dialogue Generation Jiwei Li Will Monroe Alan Ritter Michel Galley Jianfeng Gao Dan Jurafsky 220 1,328 0 05 Jun 2016
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning Y. Gal Zoubin Ghahramani UQCV BDL 287 9,156 0 06 Jun 2015