ResearchTrend.AI
Navigating to the Best Policy in Markov Decision Processes

5 June 2021
Aymen Al Marjani
Aurélien Garivier
Alexandre Proutière
arXiv: 2106.02847

Papers citing "Navigating to the Best Policy in Markov Decision Processes"

3 / 3 papers shown
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng
Haochen Zhang
Lingzhou Xue
OffRL
70
2
0
10 Oct 2024
Best Policy Identification in Linear MDPs
Jerome Taupin
Yassir Jedra
Alexandre Proutière
26
3
0
11 Aug 2022
Gap-Dependent Unsupervised Exploration for Reinforcement Learning
Jingfeng Wu
Vladimir Braverman
Lin F. Yang
15
12
0
11 Aug 2021