2007.02151
Cited By
Variational Policy Gradient Method for Reinforcement Learning with General Utilities
4 July 2020
Junyu Zhang
Alec Koppel
Amrit Singh Bedi
Csaba Szepesvári
Mengdi Wang
Papers citing "Variational Policy Gradient Method for Reinforcement Learning with General Utilities"
50 / 87 papers shown
Flow Density Control: Generative Optimization Beyond Entropy-Regularized Fine-Tuning
Riccardo De Santi
Marin Vlastelica
Ya-Ping Hsieh
Zebang Shen
Niao He
Andreas Krause
AI4CE
145
5
0
27 Nov 2025
On the Convergence of Policy Mirror Descent with Temporal Difference Evaluation
Jiacai Liu
Wenye Li
Ke Wei
217
1
0
23 Sep 2025
Policy Gradient with Self-Attention for Model-Free Distributed Nonlinear Multi-Agent Games
Eduardo Sebastián
Maitrayee Keskar
Eeman Iqbal
Eduardo Montijano
C. Sagüés
Nikolay Atanasov
184
0
0
22 Sep 2025
Bayesian Risk-Sensitive Policy Optimization For MDPs With General Loss Functions
Xiaoshuang Wang
Yifan Lin
Enlu Zhou
216
0
0
19 Sep 2025
The Geometry of Nonlinear Reinforcement Learning
Nikola Milosevic
Nico Scherf
131
0
0
01 Sep 2025
Solving General-Utility Markov Decision Processes in the Single-Trial Regime with Online Planning
Pedro P. Santos
Alberto Sardinha
Francisco S. Melo
115
0
0
21 May 2025
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
353
0
0
12 May 2025
Is there Value in Reinforcement Learning?
Lior Fox
Y. Loewenstein
OffRL
256
0
0
07 May 2025
Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm
Neural Information Processing Systems (NeurIPS), 2024
Sattar Vakili
Julia Olkhovskaya
345
3
0
30 Oct 2024
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference
International Conference on Learning Representations (ICLR), 2024
Qining Zhang
Lei Ying
OffRL
562
10
0
25 Sep 2024
The Number of Trials Matters in Infinite-Horizon General-Utility Markov Decision Processes
Pedro P. Santos
Alberto Sardinha
Francisco S. Melo
197
1
0
23 Sep 2024
Geometric Active Exploration in Markov Decision Processes: the Benefit of Abstraction
Ric De Santi
Federico Arangath Joseph
Noah Liniger
Mirco Mutti
Andreas Krause
AI4CE
270
5
0
18 Jul 2024
Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
Ric De Santi
Manish Prajapat
Andreas Krause
332
13
0
13 Jul 2024
MetaCURL: Non-stationary Concave Utility Reinforcement Learning
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
OffRL
280
3
0
30 May 2024
Inverse Concave-Utility Reinforcement Learning is Inverse Game Theory
M. Çelikok
F. Oliehoek
Jan-Willem van de Meent
342
2
0
29 May 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
341
1
0
25 Apr 2024
On the Global Convergence of Policy Gradient in Average Reward Markov Decision Processes
Navdeep Kumar
Yashaswini Murthy
Itai Shufaro
Kfir Y. Levy
R. Srikant
Shie Mannor
228
11
0
11 Mar 2024
Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence
Ilyas Fatkhullin
Niao He
379
16
0
27 Feb 2024
Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning
Zihao Li
Boyi Liu
Zhuoran Yang
Zhaoran Wang
Mengdi Wang
343
2
0
16 Feb 2024
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty
Jiahao Qiu
Hui Yuan
Alec Koppel
Furong Huang
Dinesh Manocha
Amrit Singh Bedi
Mengdi Wang
ALM
233
29
0
14 Feb 2024
On the Limitations of Markovian Rewards to Express Multi-Objective, Risk-Sensitive, and Modal Tasks
Conference on Uncertainty in Artificial Intelligence (UAI), 2024
Joar Skalse
Alessandro Abate
263
13
0
26 Jan 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
237
1
0
23 Jan 2024
Quantum Advantage Actor-Critic for Reinforcement Learning
International Conference on Agents and Artificial Intelligence (ICAART), 2024
Michael Kolle
Mohamad Hgog
Fabian Ritz
Philipp Altmann
Maximilian Zorn
Jonas Stein
Claudia Linnhoff-Popien
299
17
0
13 Jan 2024
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
Journal of Scientific Computing (J. Sci. Comput.), 2024
Jie Feng
Ke Wei
Jinchi Chen
412
4
0
02 Jan 2024
Neural Network Approximation for Pessimistic Offline Reinforcement Learning
Di Wu
Yuling Jiao
Li Shen
Haizhao Yang
Xiliang Lu
OffRL
307
2
0
19 Dec 2023
Efficient Model-Based Concave Utility Reinforcement Learning through Greedy Mirror Descent
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
300
5
0
30 Nov 2023
Stable In-hand Manipulation with Finger Specific Multi-agent Shadow Reward
Lingfeng Tao
Jiucai Zhang
Xiaoli Zhang
250
0
0
13 Sep 2023
Diversifying AI: Towards Creative Chess with AlphaZero
Tom Zahavy
Vivek Veeriah
Shaobo Hou
Kevin Waugh
Matthew Lai
Edouard Leurent
Nenad Tomašev
Lisa Schut
Demis Hassabis
Satinder Singh
322
23
0
17 Aug 2023
Invex Programs: First Order Algorithms and Their Convergence
Adarsh Barik
S. Sra
Jean Honorio
263
5
0
10 Jul 2023
Active Coverage for PAC Reinforcement Learning
Annual Conference on Computational Learning Theory (COLT), 2023
Aymen Al Marjani
Andrea Tirinzoni
E. Kaufmann
OffRL
262
7
0
23 Jun 2023
A Single-Loop Deep Actor-Critic Algorithm for Constrained Reinforcement Learning with Provable Convergence
Kexuan Wang
An Liu
Baishuo Liu
202
1
0
10 Jun 2023
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
International Conference on Machine Learning (ICML), 2023
Anas Barakat
Ilyas Fatkhullin
Niao He
258
17
0
02 Jun 2023
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Information and Inference: A Journal of the IMA (JIII), 2023
Jiacai Liu
Jinchi Chen
Ke Wei
284
4
0
31 May 2023
Scalable Primal-Dual Actor-Critic Method for Safe Multi-Agent RL with General Utilities
Neural Information Processing Systems (NeurIPS), 2023
Donghao Ying
Yunkai Zhang
Yuhao Ding
Alec Koppel
Javad Lavaei
423
22
0
27 May 2023
Inverse Reinforcement Learning with the Average Reward Criterion
Neural Information Processing Systems (NeurIPS), 2023
Feiyang Wu
Jingyang Ke
Anqi Wu
421
14
0
24 May 2023
A Coupled Flow Approach to Imitation Learning
International Conference on Machine Learning (ICML), 2023
G. Freund
Elad Sarafian
Sarit Kraus
OOD
233
16
0
29 Apr 2023
What can online reinforcement learning with function approximation benefit from general coverage conditions?
International Conference on Machine Learning (ICML), 2023
Fanghui Liu
Luca Viano
Volkan Cevher
OffRL
341
6
0
25 Apr 2023
Policy Gradient Converges to the Globally Optimal Policy for Nearly Linear-Quadratic Regulators
SIAM Journal on Control and Optimization (SICON), 2023
Yin-Huan Han
Meisam Razaviyayn
Renyuan Xu
503
7
0
15 Mar 2023
n-Step Temporal Difference Learning with Optimal n
Lakshmi Mandal
S. Bhatnagar
465
3
0
13 Mar 2023
Deep Reinforcement Learning for Cost-Effective Medical Diagnosis
International Conference on Learning Representations (ICLR), 2023
Zheng Yu
Yikuan Li
Joseph C. Kim
Kai Huang
Yuan Luo
Mengdi Wang
OffRL
378
23
0
20 Feb 2023
Scalable Multi-Agent Reinforcement Learning with General Utilities
American Control Conference (ACC), 2023
Donghao Ying
Yuhao Ding
Alec Koppel
Javad Lavaei
278
2
0
15 Feb 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu
Amy Zhang
OffRL
368
5
0
07 Feb 2023
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
International Conference on Machine Learning (ICML), 2023
Ilyas Fatkhullin
Anas Barakat
Anastasia Kireeva
Niao He
461
60
0
03 Feb 2023
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Neural Information Processing Systems (NeurIPS), 2023
Carlo Alfano
Rui Yuan
Patrick Rebeschini
669
24
0
30 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
565
20
0
30 Jan 2023
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
International Conference on Learning Representations (ICLR), 2022
Paria Rashidinejad
Hanlin Zhu
Kunhe Yang
Stuart J. Russell
Jiantao Jiao
OffRL
469
34
0
01 Nov 2022
Proximal Mean Field Learning in Shallow Neural Networks
Alexis M. H. Teter
Iman Nodozi
A. Halder
FedML
312
1
0
25 Oct 2022
Policy Gradient for Reinforcement Learning with General Utilities
Navdeep Kumar
Kaixin Wang
Kfir Y. Levy
Shie Mannor
119
6
0
03 Oct 2022
On the convex formulations of robust Markov decision processes
Mathematics of Operations Research (MOR), 2022
Julien Grand-Clément
Marek Petrik
313
13
0
21 Sep 2022
Cross apprenticeship learning framework: Properties and solution approaches
A. Aravind
Debasish Chatterjee
A. Cherukuri
217
0
0
06 Sep 2022
Page 1 of 2