Efficient iterative policy optimization

28 December 2016
Nicolas Le Roux

Papers citing "Efficient iterative policy optimization"

6 / 6 papers shown
Supervised Fine Tuning on Curated Data is Reinforcement Learning (and can be improved)
Chongli Qin
Jost Tobias Springenberg
OffRL
215
12
0
17 Jul 2025
Tapered Off-Policy REINFORCE: Stable and efficient reinforcement learning for LLMs
Nicolas Le Roux
Marc G. Bellemare
Jonathan Lebensold
Arnaud Bergeron
Joshua Greaves
Alex Fréchette
Carolyne Pelletier
Eric Thibodeau-Laufer
Sándor Toth
Sam Work
OffRL
503
34
0
18 Mar 2025
Boosted Off-Policy Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Ben London
Levi Lu
Ted Sandler
Thorsten Joachims
OffRL
304
4
0
01 Aug 2022
An operator view of policy gradient methods
Dibya Ghosh
Marlos C. Machado
Nicolas Le Roux
OffRL
289
28
0
19 Jun 2020
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
T. Matsushima
Hiroki Furuta
Y. Matsuo
Ofir Nachum
S. Gu
OffRL
309
162
0
05 Jun 2020
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
202
530
0
14 Jun 2018