Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2108.05828
Cited By
A general class of surrogate functions for stable and efficient reinforcement learning
12 August 2021
Sharan Vaswani
Olivier Bachem
Simone Totaro
Robert Mueller
Shivam Garg
M. Geist
Marlos C. Machado
P. S. Castro
Nicolas Le Roux
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A general class of surrogate functions for stable and efficient reinforcement learning"
4 / 4 papers shown
Title
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
89
2
0
18 Mar 2025
Mirror Descent Actor Critic via Bounded Advantage Learning
Ryo Iwaki
93
0
0
06 Feb 2025
Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize
Ryan DÓrazio
Nicolas Loizou
I. Laradji
Ioannis Mitliagkas
27
30
0
28 Oct 2021
A general sample complexity analysis of vanilla policy gradient
Rui Yuan
Robert Mansel Gower
A. Lazaric
69
62
0
23 Jul 2021
1