Adaptive Approximate Policy Iteration

8 February 2020

Papers citing "Adaptive Approximate Policy Iteration"

Acceleration in Policy Optimization

352

18 Jun 2023

Concentration Phenomenon for Random Dynamical Systems: An Operator Theoretic Approach

Conference on Learning for Dynamics & Control (L4DC), 2022

Muhammad Naeem

Miroslav Pajic

351

07 Dec 2022

Transportation-Inequalities, Lyapunov Stability and Sampling for Dynamical Systems on Continuous State Space

Conference on Learning for Dynamics & Control (L4DC), 2022

Muhammad Naeem

Miroslav Pajic

224

25 May 2022

Online Learning for Unknown Partially Observable MDPs

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Mehdi Jafarnia-Jahromi

Rahul Jain

A. Nayyar

317

25 Feb 2021

Improved Regret Bound and Experience Replay in Regularized Policy Iteration

International Conference on Machine Learning (ICML), 2021

151

25 Feb 2021

Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Jiafan He

Dongruo Zhou

Quanquan Gu

284

17 Feb 2021

Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

Yue Wu

Dongruo Zhou

Quanquan Gu

192

15 Feb 2021

Optimization Issues in KL-Constrained Approximate Policy Iteration

130

11 Feb 2021

Average-reward model-free reinforcement learning: a systematic review and literature mapping

300

18 Oct 2020

Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy

International Conference on Learning Representations (ICLR), 2020

Zuyue Fu

Zhuoran Yang

Zhaoran Wang

358

02 Aug 2020

Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation

International Conference on Artificial Intelligence and Statistics (AISTATS), 2020

Chen-Yu Wei

Mehdi Jafarnia-Jahromi

Haipeng Luo

Rahul Jain

332

23 Jul 2020

Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

Muhammad Naeem

Miroslav Pajic

198

15 Jun 2020

A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret

Mehdi Jafarnia-Jahromi

Chen-Yu Wei

Rahul Jain

Haipeng Luo

305

08 Jun 2020