Exploration-Enhanced POLITEX

27 August 2019

Papers citing "Exploration-Enhanced POLITEX"

Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes

Annual Conference Computational Learning Theory (COLT), 2023

Zihan Zhang

Qiaomin Xie

OffRL

271

28 Jun 2023

The Role of Coverage in Online Reinforcement Learning

International Conference on Learning Representations (ICLR), 2022

398

09 Oct 2022

Proximal Point Imitation Learning

Neural Information Processing Systems (NeurIPS), 2022

577

22 Sep 2022

Towards General Function Approximation in Zero-Sum Markov Games

International Conference on Learning Representations (ICLR), 2021

291

30 Jul 2021

Going Beyond Linear RL: Sample Efficient Neural Function Approximation

Kaixuan Huang

226

14 Jul 2021

Online Learning for Unknown Partially Observable MDPs

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Mehdi Jafarnia-Jahromi

Rahul Jain

A. Nayyar

325

25 Feb 2021

Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Yue Wu

Dongruo Zhou

Quanquan Gu

214

15 Feb 2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient

374

08 Nov 2020

Online Sparse Reinforcement Learning

784

08 Nov 2020

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy

International Conference on Learning Representations (ICLR), 2020

Zuyue Fu

Zhuoran Yang

Zhaoran Wang

362

02 Aug 2020

Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

Chen-Yu Wei

Mehdi Jafarnia-Jahromi

Haipeng Luo

Rahul Jain

362

23 Jul 2020

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Neural Information Processing Systems (NeurIPS), 2020

335

123

16 Jul 2020

Online learning in MDPs with linear function approximation and bandit feedback

Gergely Neu

Julia Olkhovskaya

302

03 Jul 2020

Learning and Planning in Average-Reward Markov Decision Processes

Yi Wan

A. Naik

R. Sutton

OffRL

310

29 Jun 2020

A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret

Mehdi Jafarnia-Jahromi

Chen-Yu Wei

Rahul Jain

Haipeng Luo

334

08 Jun 2020

Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss

Neural Information Processing Systems (NeurIPS), 2020

462

02 Mar 2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Annual Conference Computational Learning Theory (COLT), 2020

495

137

17 Feb 2020

Provably Efficient Exploration in Policy Optimization

International Conference on Machine Learning (ICML), 2019

368

303

12 Dec 2019

Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes

International Conference on Machine Learning (ICML), 2019

Chen-Yu Wei

Mehdi Jafarnia-Jahromi

Haipeng Luo

Hiteshi Sharma

R. Jain

378

120

15 Oct 2019