Echoes of Socratic Doubt: Embracing Uncertainty in Calibrated Evidential Reinforcement Learning

Abstract

We present a novel statistical approach to incorporating uncertainty awareness in model-free distributional reinforcement learning involving quantile regression-based deep Q networks. The proposed algorithm, Calibrated Evidential Quantile Regression in Deep Q Networks (CEQR-DQN), aims to address key challenges associated with separately estimating aleatoric and epistemic uncertainty in stochastic environments. It combines deep evidential learning with quantile calibration based on principles of conformal inference to provide explicit, sample-free computations of global uncertainty as opposed to local estimates based on simple variance, overcoming limitations of traditional methods in computational and statistical efficiency and handling of out-of-distribution (OOD) observations. Tested on a suite of miniaturized Atari games (i.e., MinAtar), CEQR-DQN is shown to surpass similar existing frameworks in scores and learning speed. Its ability to rigorously evaluate uncertainty improves exploration strategies and can serve as a blueprint for other algorithms requiring uncertainty awareness.
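
To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of the standard quantile-regression (pinball) loss and of a split-conformal calibration step for a pair of predicted quantiles. It is not the CEQR-DQN implementation, and it omits the evidential component entirely: the function names (`pinball_loss`, `conformal_offset`, `calibrate`) and the toy arrays are hypothetical, and the calibration rule shown is the generic conformalized quantile regression recipe, assumed here only for illustration.

```python
import numpy as np


def pinball_loss(y, q_pred, tau):
    """Quantile-regression (pinball) loss at quantile level tau."""
    diff = y - q_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def conformal_offset(y_cal, lo_cal, hi_cal, alpha):
    """Split-conformal correction for a [lo, hi] quantile interval.

    Hypothetical helper: returns the amount by which the interval should
    be widened (or shrunk, if negative) to target 1 - alpha coverage on
    exchangeable data.
    """
    # Conformity score: positive when the target falls outside the interval.
    scores = np.maximum(lo_cal - y_cal, y_cal - hi_cal)
    n = len(scores)
    # Finite-sample corrected rank, clipped to the largest available score.
    k = min(int(np.ceil((n + 1) * (1.0 - alpha))), n)
    return float(np.sort(scores)[k - 1])


def calibrate(lo, hi, offset):
    """Apply the conformal offset symmetrically to both interval ends."""
    return lo - offset, hi + offset


# Toy calibration set (made-up numbers, for illustration only).
y_cal = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
lo_cal = np.array([0.5, 0.5, 1.5, 2.5, 3.5])
hi_cal = np.array([1.0, 1.5, 2.5, 3.5, 4.5])

offset = conformal_offset(y_cal, lo_cal, hi_cal, alpha=0.2)
new_lo, new_hi = calibrate(np.array([0.0]), np.array([1.0]), offset)
```

In this sketch the offset is computed once on held-out data and then applied to every new interval, which is what makes the correction sample-free at prediction time; how the paper couples calibration with evidential learning is not specified in the abstract and is not modeled here.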
