Variational Policy Gradient Method for Reinforcement Learning with General Utilities

4 July 2020

Papers citing "Variational Policy Gradient Method for Reinforcement Learning with General Utilities"

26 / 26 papers shown

Title
Online Episodic Convex Reinforcement Learning B. Moreno Khaled Eldowa Pierre Gaillard Margaux Brégère Nadia Oudjane OffRL 27 0 0 12 May 2025
Is there Value in Reinforcement Learning? Lior Fox Y. Loewenstein OffRL 61 0 0 07 May 2025
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference Qining Zhang Lei Ying OffRL 37 2 0 25 Sep 2024
Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods Ric De Santi Manish Prajapat Andreas Krause 36 3 0 13 Jul 2024
MetaCURL: Non-stationary Concave Utility Reinforcement Learning B. Moreno Margaux Brégère Pierre Gaillard Nadia Oudjane OffRL 35 0 0 30 May 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints Bram De Cooman Johan A. K. Suykens 23 0 0 25 Apr 2024
Neural Network Approximation for Pessimistic Offline Reinforcement Learning Di Wu Yuling Jiao Li Shen Haizhao Yang Xiliang Lu OffRL 27 1 0 19 Dec 2023
Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators Yin-Huan Han Meisam Razaviyayn Renyuan Xu 22 5 0 15 Mar 2023
n-Step Temporal Difference Learning with Optimal n Lakshmi Mandal S. Bhatnagar 16 2 0 13 Mar 2023
Scalable Multi-Agent Reinforcement Learning with General Utilities Donghao Ying Yuhao Ding Alec Koppel Javad Lavaei 34 1 0 15 Feb 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning Hanlin Zhu Paria Rashidinejad Jiantao Jiao OffRL 30 15 0 30 Jan 2023
Proximal Mean Field Learning in Shallow Neural Networks Alexis M. H. Teter Iman Nodozi A. Halder FedML 37 1 0 25 Oct 2022
Cross apprenticeship learning framework: Properties and solution approaches A. Aravind Debasish Chatterjee A. Cherukuri 26 0 0 06 Sep 2022
Improved Policy Optimization for Online Imitation Learning J. Lavington Sharan Vaswani Mark W. Schmidt OffRL 13 6 0 29 Jul 2022
Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning Ruida Zhou Tao-Wen Liu D. Kalathil P. R. Kumar Chao Tian 24 13 0 10 Jun 2022
Jump-Start Reinforcement Learning Ikechukwu Uchendu Ted Xiao Yao Lu Banghua Zhu Mengyuan Yan ... Chuyuan Fu Cong Ma Jiantao Jiao Sergey Levine Karol Hausman OffRL OnRL 33 107 0 05 Apr 2022
Challenging Common Assumptions in Convex Reinforcement Learning Mirco Mutti Ric De Santi Piersilvio De Bartolomeis Marcello Restelli OffRL 24 21 0 03 Feb 2022
Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods Xin Guo Anran Hu Junzi Zhang OffRL 16 6 0 13 Sep 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations Mridul Agarwal Qinbo Bai Vaneet Aggarwal 28 12 0 12 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning Andrea Zanette Martin J. Wainwright Emma Brunskill OffRL 29 111 0 19 Aug 2021
A general sample complexity analysis of vanilla policy gradient Rui Yuan Robert Mansel Gower A. Lazaric 69 62 0 23 Jul 2021
Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint M. Geist Julien Pérolat Mathieu Laurière Romuald Elie Sarah Perrin Olivier Bachem Rémi Munos Olivier Pietquin 21 62 0 07 Jun 2021
Finite-Sample Analysis of Off-Policy Natural Actor-Critic with Linear Function Approximation Zaiwei Chen S. Khodadadian S. T. Maguluri OffRL 52 29 0 26 May 2021
On the Linear convergence of Natural Policy Gradient Algorithm S. Khodadadian P. Jhunjhunwala Sushil Mahavir Varma S. T. Maguluri 30 56 0 04 May 2021
Is Pessimism Provably Efficient for Offline RL? Ying Jin Zhuoran Yang Zhaoran Wang OffRL 27 345 0 30 Dec 2020
Sample Efficient Reinforcement Learning with REINFORCE Junzi Zhang Jongho Kim Brendan O'Donoghue Stephen P. Boyd 35 99 0 22 Oct 2020