2303.04386
Policy Mirror Descent Inherently Explores Action Space
8 March 2023
Yan Li
Guanghui Lan
OffRL
Papers citing "Policy Mirror Descent Inherently Explores Action Space"
3 / 3 papers shown
Title
First-order Policy Optimization for Robust Markov Decision Process
Yan Li
Guanghui Lan
Tuo Zhao
20
22
0
21 Sep 2022
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
18
11
0
21 Oct 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
31
120
0
30 Jan 2021