On the Performance of Temporal Difference Learning With Neural Networks
International Conference on Learning Representations (ICLR), 2023
Abstract
Neural Temporal Difference (TD) Learning is an approximate temporal difference method for policy evaluation that uses a neural network for function approximation. Analysis of Neural TD Learning has proven to be challenging. In this paper we provide a convergence analysis of Neural TD Learning with a projection onto B(θ₀, ω), a ball of fixed radius ω around the initial point θ₀. We show an approximation bound of O(ε) + Õ(1/√m), where ε is the approximation quality of the best neural network in B(θ₀, ω) and m is the width of all hidden layers in the network.
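The projected update described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the two-layer ReLU network, the width m, the radius ω, the step size, and the synthetic transitions are all illustrative choices. Each TD(0) step moves the parameters along the semi-gradient of the TD error and then projects them back onto the ball B(θ₀, ω) around the initialization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (not from the paper).
m, d = 64, 4        # hidden width m, state dimension d
gamma, alpha = 0.9, 0.01
omega = 1.0         # radius of the projection ball B(theta_0, omega)

# Two-layer ReLU network: V_theta(s) = v^T relu(W s).
W0 = rng.normal(size=(m, d)) / np.sqrt(d)
v0 = rng.normal(size=m) / np.sqrt(m)
theta0 = np.concatenate([W0.ravel(), v0])  # initial point theta_0
theta = theta0.copy()

def value(theta, s):
    """Value estimate of the network at state s."""
    W = theta[:m * d].reshape(m, d)
    v = theta[m * d:]
    return v @ np.maximum(W @ s, 0.0)

def grad(theta, s):
    """Gradient of V_theta(s) with respect to all parameters."""
    W = theta[:m * d].reshape(m, d)
    v = theta[m * d:]
    h = W @ s
    dW = np.outer(v * (h > 0), s)     # chain rule through the ReLU
    dv = np.maximum(h, 0.0)
    return np.concatenate([dW.ravel(), dv])

def project(theta, theta0, omega):
    """Euclidean projection onto the ball of radius omega around theta0."""
    diff = theta - theta0
    norm = np.linalg.norm(diff)
    if norm > omega:
        diff *= omega / norm
    return theta0 + diff

# Projected TD(0) steps on synthetic transitions (s, r, s').
for _ in range(200):
    s, s_next = rng.normal(size=d), rng.normal(size=d)
    r = rng.normal()
    delta = r + gamma * value(theta, s_next) - value(theta, s)  # TD error
    theta = project(theta + alpha * delta * grad(theta, s), theta0, omega)

# The iterate never leaves B(theta_0, omega).
print(np.linalg.norm(theta - theta0) <= omega + 1e-9)
```

The projection is what makes the analysis tractable: it keeps the iterates in a fixed neighborhood of θ₀, where the network stays close to its linearization for large width m.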
