
Finite-Sample Analysis for SARSA with Linear Function Approximation

6 February 2019

Papers citing "Finite-Sample Analysis for SARSA with Linear Function Approximation"

50 / 101 papers shown

Recent Advances in Reinforcement Learning in Finance B. Hambly Renyuan Xu Huining Yang OffRL 457 229 0 08 Dec 2021
Finite-Time Error Bounds for Distributed Linear Stochastic Approximation Yixuan Lin V. Gupta Ji Liu 276 3 0 24 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch. Journal of Machine Learning Research (JMLR), 2021. Shangtong Zhang Rémi Tachet des Combes Romain Laroche 324 16 0 04 Nov 2021
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings. AAAI Conference on Artificial Intelligence (AAAI), 2021. Matthew Shunshi Zhang Murat A. Erdogdu Animesh Garg 299 5 0 30 Oct 2021
Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes. IEEE Conference on Decision and Control (CDC), 2021. Sihan Zeng Thinh T. Doan Justin Romberg 298 21 0 21 Oct 2021
Actor-critic is implicitly biased towards high entropy optimal policies. International Conference on Learning Representations (ICLR), 2021. Yuzheng Hu Ziwei Ji Matus Telgarsky 175 12 0 21 Oct 2021
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process. Operations Research Letters (ORL), 2021. Tianjiao Li Ziwei Guan Shaofeng Zou Tengyu Xu Yingbin Liang Guanghui Lan 158 36 0 20 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization Hua Zheng Wei Xie M. Feng OffRL 417 3 0 17 Oct 2021
Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs Naman Agarwal Syomantak Chaudhuri Prateek Jain Dheeraj M. Nagaraj Praneeth Netrapalli OffRL 283 22 0 16 Oct 2021
Sim and Real: Better Together Shirli Di-Castro Shashua Dotan DiCastro Shie Mannor 209 11 0 01 Oct 2021
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning Sihan Zeng Thinh T. Doan Justin Romberg 397 30 0 29 Sep 2021
Online Robust Reinforcement Learning with Model Uncertainty Yue Wang Shaofeng Zou OOD OffRL 338 133 0 29 Sep 2021
Truncated Emphatic Temporal Difference Methods for Prediction and Control. Journal of Machine Learning Research (JMLR), 2021. Shangtong Zhang Shimon Whiteson OffRL 235 15 0 11 Aug 2021
Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021. Anas Barakat Pascal Bianchi Julien Lehmann 181 13 0 14 Jun 2021
Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning. IEEE Transactions on Signal Processing (IEEE TSP), 2021. Chang Tian An Liu Guang-Li Huang Wu Luo 97 15 0 26 May 2021
Deeply-Debiased Off-Policy Interval Estimation. International Conference on Machine Learning (ICML), 2021. C. Shi Runzhe Wan Victor Chernozhukov R. Song OffRL 193 43 0 10 May 2021
Predictor-Corrector(PC) Temporal Difference(TD) Learning (PCTD) C. Bowyer 67 1 0 15 Apr 2021
Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation. Neural Information Processing Systems (NeurIPS), 2021. Yue Wang Shaofeng Zou Yi Zhou 375 11 0 07 Apr 2021
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity. International Conference on Learning Representations (ICLR), 2021. Shaocong Ma Ziyi Chen Yi Zhou Shaofeng Zou 160 12 0 30 Mar 2021
Multi-Agent Off-Policy TD Learning: Finite-Time Analysis with Near-Optimal Sample Complexity and Communication Complexity Ziyi Chen Yi Zhou Rongrong Chen OffRL 171 7 0 24 Mar 2021
Breaking the Deadly Triad with a Target Network. International Conference on Machine Learning (ICML), 2021. Shangtong Zhang Hengshuai Yao Shimon Whiteson AAML 568 55 0 21 Jan 2021
Towards Understanding Asynchronous Advantage Actor-critic: Convergence and Linear Speedup. IEEE Transactions on Signal Processing (TSP), 2020. Han Shen Jianchao Tan Min-Fong Hong Tianyi Chen 308 43 0 31 Dec 2020
On Convergence of Gradient Expected Sarsa(λ). AAAI Conference on Artificial Intelligence (AAAI), 2020. Long Yang Gang Zheng Yu Zhang Qian Zheng Pengfei Li Gang Pan 215 4 0 14 Dec 2020
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee Minghao Han Yuan Tian Lixian Zhang Jun Wang Wei Pan 124 56 0 13 Nov 2020
Sample Complexity Bounds for Two Timescale Value-based Reinforcement Learning Algorithms Tengyu Xu Yingbin Liang 236 27 0 10 Nov 2020
Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis. Neural Information Processing Systems (NeurIPS), 2020. Shaocong Ma Yi Zhou Shaofeng Zou OffRL 270 17 0 26 Oct 2020
Finite-Time Analysis for Double Q-learning Huaqing Xiong Linna Zhao Yingbin Liang Wei Zhang 170 32 0 29 Sep 2020
Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis Arunselvan Ramaswamy Eyke Hüllermeier 86 4 0 25 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy. International Conference on Learning Representations (ICLR), 2020. Zuyue Fu Zhuoran Yang Zhaoran Wang 236 48 0 02 Aug 2020
Momentum Q-learning with Finite-Sample Convergence Guarantee Bowen Weng Huaqing Xiong Linna Zhao Yingbin Liang Wei Zhang 174 8 0 30 Jul 2020
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent. International Joint Conference on Artificial Intelligence (IJCAI), 2020. Chuhan Wu Fangzhao Wu Tao Qi Yongfeng Huang 112 24 0 15 Jul 2020
Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping. International Conference on Machine Learning (ICML), 2020. Dongruo Zhou Jiafan He Quanquan Gu 301 141 0 23 Jun 2020
Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation. Neural Information Processing Systems (NeurIPS), 2020. Devavrat Shah Dogyoon Song Zhi Xu Yuzhe Yang 275 33 0 11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang OOD MLT 663 11 0 08 Jun 2020
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction Gen Li Yuting Wei Yuejie Chi Yuantao Gu Yuxin Chen OffRL 380 126 0 04 Jun 2020
Finite-sample Analysis of Greedy-GQ with Linear Function Approximation under Markovian Noise Yue Wang Shaofeng Zou 142 22 0 20 May 2020
Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms Tengyu Xu Zhe Wang Yingbin Liang 244 63 0 07 May 2020
A Finite Time Analysis of Two Time-Scale Actor Critic Methods. Neural Information Processing Systems (NeurIPS), 2020. Yue Wu Weitong Zhang Pan Xu Quanquan Gu 334 167 0 04 May 2020
Actor-Critic Reinforcement Learning for Control with Stability Guarantee. IEEE Robotics and Automation Letters (RA-L), 2020. Minghao Han Lixian Zhang Jun Wang Wei Pan 261 136 0 29 Apr 2020
Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms Tengyu Xu Zhe Wang Yingbin Liang 311 25 0 27 Apr 2020
Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity S. Du Jason D. Lee G. Mahajan Ruosong Wang 110 39 0 17 Feb 2020
Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling. AAAI Conference on Artificial Intelligence (AAAI), 2020. Huaqing Xiong Tengyu Xu Yingbin Liang Wei Zhang 171 35 0 15 Feb 2020
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making. International Conference on Machine Learning (ICML), 2020. C. Shi Runzhe Wan R. Song Wenbin Lu Ling Leng 162 41 0 05 Feb 2020
Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework. Journal of the American Statistical Association (JASA), 2020. C. Shi Xiaoyu Wang Shuang Luo Hongtu Zhu Jieping Ye R. Song CML OffRL 442 48 0 05 Feb 2020
Reanalysis of Variance Reduced Temporal Difference Learning. International Conference on Learning Representations (ICLR), 2020. Tengyu Xu Zhe Wang Yi Zhou Yingbin Liang OffRL 321 44 0 07 Jan 2020
Scalable Reinforcement Learning for Multi-Agent Networked Systems. Operational Research (OR), 2019. Guannan Qu Adam Wierman Na Li 244 43 0 05 Dec 2019
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms Dong-hwan Lee Niao He 332 28 0 04 Dec 2019
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms Jianchao Tan Zhuoran Yang Tamer Basar 527 1,440 0 24 Nov 2019
Generalized Speedy Q-learning. IEEE Control Systems Letters (L-CSS), 2019. I. John Chandramouli Kamanchi S. Bhatnagar 119 20 0 01 Nov 2019
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation. Machine-mediated learning (ML), 2019. Harshat Kumar Alec Koppel Alejandro Ribeiro 301 95 0 18 Oct 2019