arXiv: 2403.14156
Policy Mirror Descent with Lookahead
21 March 2024
Kimon Protopapas, Anas Barakat
Papers citing "Policy Mirror Descent with Lookahead"
10 / 10 papers shown

Ordering-based Conditions for Global Convergence of Policy Gradient Methods
Jincheng Mei, Bo Dai, Alekh Agarwal, Mohammad Ghavamzadeh, Csaba Szepesvári, Dale Schuurmans
52 | 4 | 0
02 Apr 2025

Functional Acceleration for Policy Mirror Descent
Veronica Chelu, Doina Precup
23 | 0 | 0
23 Jul 2024

Policy Mirror Descent Inherently Explores Action Space
Yan Li, Guanghui Lan
OffRL
51 | 8 | 0
08 Mar 2023

Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Emmeran Johnson, Ciara Pike-Burke, Patrick Rebeschini
26 | 11 | 0
22 Feb 2023

A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano, Rui Yuan, Patrick Rebeschini
54 | 15 | 0
30 Jan 2023

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
303 | 11,730 | 0
04 Mar 2022

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction
Assaf Hallak, Gal Dalal, Steven Dalton, I. Frosio, Shie Mannor, Gal Chechik
OffRL, OnRL
29 | 9 | 0
04 Jul 2021

Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87 | 135 | 0
30 Jan 2021

On Linear Convergence of Policy Gradient Methods for Finite MDPs
Jalaj Bhandari, Daniel Russo
48 | 59 | 0
21 Jul 2020

Beyond the One Step Greedy Approach in Reinforcement Learning
Yonathan Efroni, Gal Dalal, B. Scherrer, Shie Mannor
OffRL
40 | 48 | 0
10 Feb 2018