Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies

24 March 2022

Papers citing "Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies"

Title
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning Srinjoy Roy Swagatam Das 19 0 0 31 Mar 2024
Horizon-Free Regret for Linear Markov Decision Processes Zihan Zhang Jason D. Lee Yuxin Chen Simon S. Du 23 3 0 15 Mar 2024
Optimal Multi-Distribution Learning Zihan Zhang Wenhao Zhan Yuxin Chen Simon S. Du Jason D. Lee 21 12 0 08 Dec 2023
Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation Jiayi Huang Han Zhong Liwei Wang Lin F. Yang 24 2 0 07 Dec 2023
Settling the Sample Complexity of Online Reinforcement Learning Zihan Zhang Yuxin Chen Jason D. Lee S. Du OffRL 90 21 0 25 Jul 2023
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs Kaixuan Ji Qingyue Zhao Jiafan He Weitong Zhang Q. Gu 34 4 0 15 May 2023
Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation Yifei Min Jiafan He Tianhao Wang Quanquan Gu 33 7 0 10 May 2023
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs Junkai Zhang Weitong Zhang Quanquan Gu 4 3 0 17 Mar 2023
Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret Han Zhong Jiachen Hu Yecheng Xue Tongyang Li Liwei Wang 13 3 0 21 Feb 2023
Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency Heyang Zhao Jiafan He Dongruo Zhou Tong Zhang Quanquan Gu 24 27 0 21 Feb 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments Runlong Zhou Zihan Zhang S. Du 18 10 0 31 Jan 2023
Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes Runlong Zhou Ruosong Wang S. Du 23 3 0 20 Oct 2022
A Unified Algorithm for Stochastic Path Problems Christoph Dann Chen-Yu Wei Julian Zimmert 28 0 0 17 Oct 2022
Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path Liyu Chen Andrea Tirinzoni Matteo Pirotta A. Lazaric 24 3 0 10 Oct 2022
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient Ming Yin Mengdi Wang Yu-Xiang Wang OffRL 43 11 0 03 Oct 2022
Computationally Efficient Horizon-Free Reinforcement Learning for Linear Mixture MDPs Dongruo Zhou Quanquan Gu 73 43 0 23 May 2022
Slowly Changing Adversarial Bandit Algorithms are Efficient for Discounted MDPs Ian A. Kash L. Reyzin Zishun Yu 18 0 0 18 May 2022
UCB Momentum Q-learning: Correcting the bias without forgetting Pierre Menard O. D. Domingues Xuedong Shang Michal Valko 72 40 0 01 Mar 2021