A Small Gain Analysis of Single Timescale Actor Critic

Abstract

We consider a version of actor-critic which uses proportional step-sizes and only one critic update with a single sample from the stationary distribution per actor step. We provide an analysis of this method using the small-gain theorem. Specifically, we prove that this method can be used to find a stationary point, and that the resulting sample complexity improves the state of the art for actor-critic methods to $O\left(\mu^{-2} \epsilon^{-2}\right)$ to find an $\epsilon$-approximate stationary point, where $\mu$ is the condition number associated with the critic.
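
To make the setup concrete, the following is a minimal sketch of a single-timescale actor-critic loop: each iteration draws one sample from the stationary distribution of the current policy, performs one critic update, and performs one actor update with a step-size proportional to the critic's. Everything beyond that outline is an assumption made for illustration and is not taken from the paper: the tabular MDP with a known transition tensor `P` and reward table `R`, the discounted TD(0) critic, the softmax policy, and the hypothetical names `stationary_distribution`, `single_timescale_actor_critic`, `beta`, and `ratio`.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def stationary_distribution(P_pi):
    """Solve d^T P_pi = d^T with sum(d) = 1 by least squares.

    Assumes the chain induced by the policy has a unique stationary distribution.
    """
    n = P_pi.shape[0]
    A = np.vstack([P_pi.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    d, *_ = np.linalg.lstsq(A, b, rcond=None)
    d = np.clip(d, 0.0, None)  # remove tiny negative round-off
    return d / d.sum()


def single_timescale_actor_critic(P, R, gamma=0.9, beta=0.05, ratio=0.1,
                                  steps=2000, seed=0):
    """Illustrative loop: one sample, one critic step, one actor step.

    P has shape (S, A, S) and R has shape (S, A). The actor step-size is
    alpha = ratio * beta, i.e. proportional to the critic step-size.
    """
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    theta = np.zeros((S, A))  # actor: softmax logits (assumed parameterization)
    v = np.zeros(S)           # critic: tabular state values (assumed)
    alpha = ratio * beta

    for _ in range(steps):
        pi = softmax(theta)
        P_pi = np.einsum("sa,sat->st", pi, P)  # state chain under pi
        d = stationary_distribution(P_pi)

        # A single sample from the stationary distribution of the current policy.
        s = rng.choice(S, p=d)
        a = rng.choice(A, p=pi[s])
        s_next = rng.choice(S, p=P[s, a])

        # TD error, computed once and shared by the critic and the actor.
        delta = R[s, a] + gamma * v[s_next] - v[s]

        # One critic update.
        v[s] += beta * delta

        # One actor update: gradient of log softmax is onehot(a) - pi[s].
        grad_log = -pi[s].copy()
        grad_log[a] += 1.0
        theta[s] += alpha * delta * grad_log

    return theta, v
```

Calling `single_timescale_actor_critic(P, R)` on a small ergodic MDP returns the final logits and value estimates. The fixed `ratio` is what makes the scheme single-timescale: the actor and critic step-sizes differ only by a constant factor instead of decaying at different rates.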
