Estimating Risk and Uncertainty in Deep Reinforcement Learning
We demonstrate a method for separately estimating aleatoric risk and epistemic uncertainty in deep reinforcement learning. Aleatoric risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk-sensitivity and for efficiently exploring an environment. We present a Bayesian framework for learning the return distribution in reinforcement learning, which provides theoretical foundations for quantifying both types of uncertainty. The variance of the return distribution yields the aleatoric risk, and our Bayesian formulation provides the epistemic uncertainty. Based on our framework, we show that the disagreement between only two neural networks is sufficient to produce an estimate of the epistemic uncertainty on the expected return, thus providing a simple and computationally cheap uncertainty metric. We present experiments that illustrate our method and some of its applications.
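The two-network disagreement idea can be sketched in a few lines. Below is a minimal, hypothetical illustration (not the paper's implementation): two tiny random-feature regressors, each fit on a bootstrap resample of toy return data, stand in for the two neural networks, and their squared disagreement serves as the epistemic-uncertainty estimate on the expected return. The aleatoric part, which the paper obtains from the variance of the learned return distribution, is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "return" data; the heteroscedastic noise plays the role of
# aleatoric risk, which this sketch does not attempt to estimate.
x = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * x) + rng.normal(0, 0.1 + 0.2 * np.abs(x))

def fit_model(x, y, seed):
    """Fit a small random-feature least-squares regressor on a bootstrap
    resample of the data; stands in for one of the two neural networks."""
    local = np.random.default_rng(seed)
    idx = local.integers(0, len(x), len(x))   # bootstrap resample
    w = local.normal(size=(1, 32))            # fixed random features
    phi = np.tanh(x[idx] @ w)
    theta, *_ = np.linalg.lstsq(phi, y[idx], rcond=None)
    return lambda q: np.tanh(q @ w) @ theta

m1 = fit_model(x, y, seed=1)
m2 = fit_model(x, y, seed=2)

# Query points: average the pair for the expected-return estimate,
# use their disagreement as the epistemic-uncertainty estimate.
xq = np.linspace(-1, 1, 5).reshape(-1, 1)
mean_est = 0.5 * (m1(xq) + m2(xq))
epistemic = 0.5 * (m1(xq) - m2(xq)) ** 2
```

In regions well covered by data the two models tend to agree and `epistemic` is small; away from the data their predictions diverge, which is the behavior an exploration or risk-sensitive algorithm would exploit.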