Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.01854
Cited By
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
2 June 2023
Anas Barakat
Ilyas Fatkhullin
Niao He
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space"
14 / 14 papers shown
Title
Online Episodic Convex Reinforcement Learning
B. Moreno
Khaled Eldowa
Pierre Gaillard
Margaux Brégère
Nadia Oudjane
OffRL
22
0
0
12 May 2025
From Gradient Clipping to Normalization for Heavy Tailed SGD
Florian Hübler
Ilyas Fatkhullin
Niao He
40
5
0
17 Oct 2024
Global Reinforcement Learning: Beyond Linear and Convex Rewards via Submodular Semi-gradient Methods
Ric De Santi
Manish Prajapat
Andreas Krause
36
3
0
13 Jul 2024
MetaCURL: Non-stationary Concave Utility Reinforcement Learning
B. Moreno
Margaux Brégère
Pierre Gaillard
Nadia Oudjane
OffRL
29
0
0
30 May 2024
Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline
Wenjia Meng
Qian Zheng
Long Yang
Yilong Yin
Gang Pan
OffRL
34
0
0
04 May 2024
Taming Nonconvex Stochastic Mirror Descent with General Bregman Divergence
Ilyas Fatkhullin
Niao He
27
3
0
27 Feb 2024
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang
Haizhao Yang
14
0
0
23 Jan 2024
Inverse Reinforcement Learning with the Average Reward Criterion
Feiyang Wu
Jingyang Ke
Anqi Wu
21
9
0
24 May 2023
Policy Gradient for Reinforcement Learning with General Utilities
Navdeep Kumar
Kaixin Wang
Kfir Y. Levy
Shie Mannor
11
4
0
03 Oct 2022
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function
Saeed Masiha
Saber Salehkaleybar
Niao He
Negar Kiyavash
Patrick Thiran
79
18
0
25 May 2022
PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
Matilde Gargiani
Andrea Zanelli
Andrea Martinelli
Tyler H. Summers
John Lygeros
33
14
0
01 Feb 2022
On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
Amrit Singh Bedi
Souradip Chakraborty
Anjaly Parayil
Brian M. Sadler
Pratap Tokekar
Alec Koppel
41
17
0
28 Jan 2022
On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Junyu Zhang
Chengzhuo Ni
Zheng Yu
Csaba Szepesvári
Mengdi Wang
44
66
0
17 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87
136
0
30 Jan 2021
1