2402.05187
Cited By
Learning mirror maps in policy mirror descent
7 February 2024
Carlo Alfano
Sebastian Towers
Silvia Sapora
Chris Xiaoxuan Lu
Patrick Rebeschini
Papers citing "Learning mirror maps in policy mirror descent"
5 / 5 papers shown
Title
Policy Mirror Descent Inherently Explores Action Space
Yan Li
Guanghui Lan
OffRL
51
8
0
08 Mar 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
54
15
0
30 Jan 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
52
11
0
21 Oct 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87
136
0
30 Jan 2021