ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.05187
  4. Cited By
Learning mirror maps in policy mirror descent

Learning mirror maps in policy mirror descent

7 February 2024
Carlo Alfano
Sebastian Towers
Silvia Sapora
Chris Xiaoxuan Lu
Patrick Rebeschini
ArXivPDFHTML

Papers citing "Learning mirror maps in policy mirror descent"

5 / 5 papers shown
Title
Policy Mirror Descent Inherently Explores Action Space
Policy Mirror Descent Inherently Explores Action Space
Yan Li
Guanghui Lan
OffRL
51
8
0
08 Mar 2023
A Novel Framework for Policy Mirror Descent with General
  Parameterization and Linear Convergence
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Carlo Alfano
Rui Yuan
Patrick Rebeschini
54
15
0
30 Jan 2023
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
303
11,881
0
04 Mar 2022
Actor-critic is implicitly biased towards high entropy optimal policies
Actor-critic is implicitly biased towards high entropy optimal policies
Yuzheng Hu
Ziwei Ji
Matus Telgarsky
52
11
0
21 Oct 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence,
  New Sampling Complexity, and Generalized Problem Classes
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87
136
0
30 Jan 2021
1