A general class of surrogate functions for stable and efficient reinforcement learning

12 August 2021

Olivier Bachem

Marlos C. Machado

Nicolas Le Roux

Papers citing "A general class of surrogate functions for stable and efficient reinforcement learning"

Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux, Marc G. Bellemare, Jonathan Lebensold, Arnaud Bergeron, Joshua Greaves, Alex Fréchette, Carolyne Pelletier, Eric Thibodeau-Laufer, Sándor Toth, Sam Work
Topic: OffRL. 18 Mar 2025

Mirror Descent Actor Critic via Bounded Advantage Learning
Ryo Iwaki
06 Feb 2025

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan D'Orazio, Nicolas Loizou, I. Laradji, Ioannis Mitliagkas
28 Oct 2021

A general sample complexity analysis of vanilla policy gradient
Rui Yuan, Robert Mansel Gower, A. Lazaric
23 Jul 2021