
A Small Gain Analysis of Single Timescale Actor Critic

SIAM Journal on Control and Optimization (SICON), 2022
Abstract

We consider a version of actor-critic which uses proportional step-sizes and only one critic update with a single sample from the stationary distribution per actor step. We provide an analysis of this method using the small gain theorem. Specifically, we prove that this method can be used to find a stationary point, and that the resulting sample complexity improves the state of the art for actor-critic methods from $O\left(\mu^{-4} \epsilon^{-2}\right)$ to $O\left(\mu^{-2} \epsilon^{-2}\right)$ to find an $\epsilon$-approximate stationary point, where $\mu$ is the condition number associated with the critic.
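
To give a feel for the size of the stated improvement, the two bounds can be compared numerically. The sketch below is illustrative only: it drops the constants hidden by the $O(\cdot)$ notation, and the function names `prior_bound`, `improved_bound`, and `improvement_factor` are hypothetical, not taken from the paper.

```python
def prior_bound(mu: float, eps: float) -> float:
    """Leading term mu^-4 * eps^-2 of the earlier bound (constants dropped)."""
    return mu ** -4 * eps ** -2


def improved_bound(mu: float, eps: float) -> float:
    """Leading term mu^-2 * eps^-2 of the bound stated in the abstract."""
    return mu ** -2 * eps ** -2


def improvement_factor(mu: float, eps: float) -> float:
    """Ratio of the two leading terms; algebraically this equals mu^-2."""
    return prior_bound(mu, eps) / improved_bound(mu, eps)


if __name__ == "__main__":
    # Assumed example values, chosen only for illustration.
    for mu in (0.5, 0.1, 0.01):
        print(f"mu={mu}: ratio of leading terms = {improvement_factor(mu, 0.01):.6g}")
```

Under these assumptions the ratio depends only on $\mu$, not on $\epsilon$: the dependence on $\epsilon$ is the same in both bounds, and the gain comes entirely from the lower power of $\mu^{-1}$.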
