1612.08967
Cited By
Efficient iterative policy optimization
28 December 2016
Nicolas Le Roux
Papers citing
"Efficient iterative policy optimization"
6 / 6 papers shown
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)
Chongli Qin
Jost Tobias Springenberg
OffRL
215
12
0
17 Jul 2025
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
503
34
0
18 Mar 2025
Boosted Off-Policy Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Ben London
Levi Lu
Ted Sandler
Thorsten Joachims
OffRL
304
4
0
01 Aug 2022
An operator view of policy gradient methods
Dibya Ghosh
Marlos C. Machado
Nicolas Le Roux
OffRL
289
28
0
19 Jun 2020
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
T. Matsushima
Hiroki Furuta
Y. Matsuo
Ofir Nachum
S. Gu
OffRL
309
162
0
05 Jun 2020
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
202
530
0
14 Jun 2018