ResearchTrend.AI
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

24 March 2022
Zihan Zhang
Xiangyang Ji
S. Du

Papers citing "Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies"

18 / 18 papers shown
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Srinjoy Roy
Swagatam Das
19
0
0
31 Mar 2024
Horizon-Free Regret for Linear Markov Decision Processes
Zihan Zhang
Jason D. Lee
Yuxin Chen
Simon S. Du
23
3
0
15 Mar 2024
Optimal Multi-Distribution Learning
Zihan Zhang
Wenhao Zhan
Yuxin Chen
Simon S. Du
Jason D. Lee
21
12
0
08 Dec 2023
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation
Jiayi Huang
Han Zhong
Liwei Wang
Lin F. Yang
22
2
0
07 Dec 2023
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
90
21
0
25 Jul 2023
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs
Kaixuan Ji
Qingyue Zhao
Jiafan He
Weitong Zhang
Q. Gu
34
4
0
15 May 2023
Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation
Yifei Min
Jiafan He
Tianhao Wang
Quanquan Gu
33
7
0
10 May 2023
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
Junkai Zhang
Weitong Zhang
Quanquan Gu
4
3
0
17 Mar 2023
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret
Han Zhong
Jiachen Hu
Yecheng Xue
Tongyang Li
Liwei Wang
13
3
0
21 Feb 2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao
Jiafan He
Dongruo Zhou
Tong Zhang
Quanquan Gu
24
27
0
21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments
Runlong Zhou
Zihan Zhang
S. Du
16
10
0
31 Jan 2023
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes
Runlong Zhou
Ruosong Wang
S. Du
18
3
0
20 Oct 2022
A Unified Algorithm for Stochastic Path Problems
Christoph Dann
Chen-Yu Wei
Julian Zimmert
28
0
0
17 Oct 2022
Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path
Liyu Chen
Andrea Tirinzoni
Matteo Pirotta
A. Lazaric
24
3
0
10 Oct 2022
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin
Mengdi Wang
Yu-Xiang Wang
OffRL
43
11
0
03 Oct 2022
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs
Dongruo Zhou
Quanquan Gu
73
43
0
23 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs
Ian A. Kash
L. Reyzin
Zishun Yu
16
0
0
18 May 2022
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
72
40
0
01 Mar 2021