Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation

30 January 2023

Papers citing "Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation"

16 / 16 papers shown

Title
Enhancing PPO with Trajectory-Aware Hybrid Policies Qisai Liu Zhanhong Jiang Hsin-Jung Yang Mahsa Khosravi Joshua R. Waite S. Sarkar 44 0 0 21 Feb 2025
Imitation Learning in Discounted Linear MDPs without exploration assumptions Luca Viano Stratis Skoulakis V. Cevher 30 3 0 03 May 2024
Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition Long-Fei Li Peng Zhao Zhi-Hua Zhou 37 4 0 07 Mar 2024
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback Canzhe Zhao Ruofeng Yang Baoxiang Wang Xuezhou Zhang Shuai Li 22 2 0 14 Nov 2023
Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback Haolin Liu Chen-Yu Wei Julian Zimmert 17 6 0 17 Oct 2023
Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits Haolin Liu Chen-Yu Wei Julian Zimmert 25 9 0 02 Sep 2023
Rate-Optimal Policy Optimization for Linear Markov Decision Processes Uri Sherman Alon Cohen Tomer Koren Yishay Mansour 28 7 0 28 Aug 2023
Is RLHF More Difficult than Standard RL? Yuanhao Wang Qinghua Liu Chi Jin OffRL 9 57 0 25 Jun 2023
A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes Han Zhong Tong Zhang 19 26 0 15 May 2023
Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback Tal Lancewicki Aviv A. Rosenberg Dmitry Sotnikov 24 3 0 13 May 2023
Refined Regret for Adversarial MDPs with Linear Function Approximation Yan Dai Haipeng Luo Chen-Yu Wei Julian Zimmert 18 12 0 30 Jan 2023
Near-Optimal Regret for Adversarial MDP with Delayed Bandit Feedback Tiancheng Jin Tal Lancewicki Haipeng Luo Yishay Mansour Aviv A. Rosenberg 66 21 0 31 Jan 2022
First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach Andrew Wagenmaker Yifang Chen Max Simchowitz S. Du Kevin G. Jamieson 71 36 0 07 Dec 2021
Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs Jiafan He Dongruo Zhou Quanquan Gu 95 23 0 17 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes Guanghui Lan 87 136 0 30 Jan 2021
Deep Reinforcement Learning for Autonomous Driving: A Survey B. R. Kiran Ibrahim Sobh V. Talpaert Patrick Mannion A. A. Sallab S. Yogamani P. Pérez 143 1,628 0 02 Feb 2020