A Unifying View of Optimism in Episodic Reinforcement Learning

3 July 2020

Papers citing "A Unifying View of Optimism in Episodic Reinforcement Learning"

26 / 26 papers shown

Title
TW-CRL: Time-Weighted Contrastive Reward Learning for Efficient Inverse Reinforcement Learning Yuxuan Li Ning Yang Ning Yang Stephen Xia OffRL 53 0 0 08 Apr 2025
Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling Jasmine Bayrooti Carl Henrik Ek Amanda Prorok 42 0 0 07 Oct 2024
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond Xutong Liu Siwei Wang Jinhang Zuo Han Zhong Xuchuang Wang Zhiyong Wang Shuai Li Mohammad Hajiesmaili J. C. Lui Wei Chen 85 1 0 03 Jun 2024
Offline RL via Feature-Occupancy Gradient Ascent Gergely Neu Nneka Okolo OffRL 34 0 0 22 May 2024
Behind the Myth of Exploration in Policy Gradients Adrien Bolland Gaspard Lambrechts Damien Ernst 59 0 0 31 Jan 2024
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics Michal Nauman Marek Cygan 40 1 0 30 Oct 2023
Settling the Sample Complexity of Online Reinforcement Learning Zihan Zhang Yuxin Chen Jason D. Lee S. Du OffRL 98 22 0 25 Jul 2023
Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria Fivos Kalogiannis Ioannis Panageas 39 8 0 23 May 2023
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage Jose H. Blanchet Miao Lu Tong Zhang Han Zhong OffRL 45 30 0 16 May 2023
Does Sparsity Help in Learning Misspecified Linear Bandits? Jialin Dong Lin F. Yang 25 1 0 29 Mar 2023
Sharp Variance-Dependent Bounds in Reinforcement Learning: Best of Both Worlds in Stochastic and Deterministic Environments Runlong Zhou Zihan Zhang S. Du 44 10 0 31 Jan 2023
Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization Gergely Neu Nneka Okolo 37 6 0 21 Oct 2022
Convex duality for stochastic shortest path problems in known and unknown environments Kelli Francis-Staite 29 0 0 31 Jul 2022
Active Exploration via Experiment Design in Markov Chains Mojmír Mutný Tadeusz Janik Andreas Krause 43 14 0 29 Jun 2022
Horizon-Free Reinforcement Learning in Polynomial Time: the Power of Stationary Policies Zihan Zhang Xiangyang Ji S. Du 30 21 0 24 Mar 2022
Settling the Horizon-Dependence of Sample Complexity in Reinforcement Learning Yuanzhi Li Ruosong Wang Lin F. Yang 27 20 0 01 Nov 2021
Representation Learning for Online and Offline RL in Low-rank MDPs Masatoshi Uehara Xuezhou Zhang Wen Sun OffRL 62 127 0 09 Oct 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation Yifei Min Tianhao Wang Dongruo Zhou Quanquan Gu OffRL 37 38 0 22 Jun 2021
The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces Chi Jin Qinghua Liu Tiancheng Yu 26 50 0 07 Jun 2021
Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret Jean Tarbouriech Runlong Zhou S. Du Matteo Pirotta M. Valko A. Lazaric 59 35 0 22 Apr 2021
Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms Chi Jin Qinghua Liu Sobhan Miryoosefi OffRL 35 215 0 01 Feb 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints Chi Jin Zhuoran Yang Zhaoran Wang OffRL 122 166 0 06 Jan 2021
Logistic Q-Learning Joan Bas-Serrano Sebastian Curi Andreas Krause Gergely Neu 14 40 0 21 Oct 2020
Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs Jiafan He Dongruo Zhou Quanquan Gu 21 37 0 01 Oct 2020
Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon Zihan Zhang Xiangyang Ji S. Du OffRL 17 104 0 28 Sep 2020
Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning Sebastian Curi Felix Berkenkamp Andreas Krause 33 82 0 15 Jun 2020