Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
International Conference on Learning Representations (ICLR), 2019
18 September 2019
Pan Xu, F. Gao, Quanquan Gu
arXiv: 1909.08610

Papers citing "Sample Efficient Policy Gradient Methods with Recursive Variance Reduction"

50 / 63 papers shown

Predictive Spike Timing Enables Distributed Shortest Path Computation in Spiking Neural Networks
Simen Storesund, Kristian Valset Aars, Robin Dietrich, Nicolai Waniek
12 Sep 2025

Reusing Trajectories in Policy Gradients Enables Fast Convergence
Alessandro Montenegro, Federico Mansutti, Marco Mussi, Matteo Papini, Alberto Maria Metelli
OnRL
06 Jun 2025

Accelerating RLHF Training with Reward Variance Increase
Zonglin Yang, Zhexuan Gu, Houduo Qi, Yancheng Yuan
29 May 2025

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
Swetha Ganesh, Vaneet Aggarwal
26 May 2025

Enhancing PPO with Trajectory-Aware Hybrid Policies
Qisai Liu, Zhanhong Jiang, Hsin-Jung Yang, Mahsa Khosravi, Joshua R. Waite, Soumik Sarkar
21 Feb 2025

Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
Han Wang, Sihong He, Zhili Zhang, Fei Miao, James Anderson
29 May 2024

Policy Gradient with Active Importance Sampling
Matteo Papini, Giorgio Manganini, Alberto Maria Metelli, Marcello Restelli
OffRL
09 May 2024

Learning Optimal Deterministic Policies with Stochastic Policy Gradients
International Conference on Machine Learning (ICML), 2024
Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli, Matteo Papini
03 May 2024

Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Guangchen Lan, Dong-Jun Han, Abolfazl Hashemi, Vaneet Aggarwal, Christopher G. Brinton
09 Apr 2024

Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries
Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal
FedML
15 Mar 2024

Towards Efficient Risk-Sensitive Policy Gradient: An Iteration Complexity Analysis
Rui Liu, Erfaun Noorani, John S. Baras
13 Mar 2024

On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang, Haizhao Yang
23 Jan 2024

Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
Journal of Scientific Computing (J. Sci. Comput.), 2024
Jie Feng, Ke Wei, Jinchi Chen
02 Jan 2024

A safe exploration approach to constrained Markov decision processes
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Tingting Ni, Maryam Kamgarpour
01 Dec 2023

Efficiently Escaping Saddle Points for Policy Optimization
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Sadegh Khorasani, Saber Salehkaleybar, Negar Kiyavash, Niao He, Matthias Grossglauser
15 Nov 2023

Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
Neural Information Processing Systems (NeurIPS), 2023
Shenao Zhang, Boyi Liu, Zhaoran Wang, Tuo Zhao
30 Oct 2023

Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Washim Uddin Mondal, Vaneet Aggarwal
18 Oct 2023

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
International Conference on Machine Learning (ICML), 2023
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Tian Ding, Zhimin Luo
16 Oct 2023

Improved Communication Efficiency in Federated Natural Policy Gradient via ADMM-based Gradient Updates
Neural Information Processing Systems (NeurIPS), 2023
Guangchen Lan, Han Wang, James Anderson, Christopher G. Brinton, Vaneet Aggarwal
FedML
09 Oct 2023

Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
International Conference on Learning Representations (ICLR), 2023
Sara Klein, Simon Weissmann, Leif Döring
04 Oct 2023

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
International Conference on Machine Learning (ICML), 2023
Anas Barakat, Ilyas Fatkhullin, Niao He
02 Jun 2023

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
International Conference on Machine Learning (ICML), 2023
Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He
03 Feb 2023

Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu, Chen Xie, Qinwen Deng, Dongdong Ge, Yi-Li Ye
28 Jan 2023

Variance-Reduced Conservative Policy Iteration
International Conference on Algorithmic Learning Theory (ALT), 2022
Naman Agarwal, Brian Bullins, Karan Singh
12 Dec 2022

An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Neural Information Processing Systems (NeurIPS), 2022
Yanli Liu, Jianchao Tan, Tamer Basar, W. Yin
15 Nov 2022

Finite-time analysis of single-timescale actor-critic
Neural Information Processing Systems (NeurIPS), 2022
Xu-yang Chen, Tianyuan Chen
OffRL
18 Oct 2022

On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits
Yuxuan Han, Zhicong Liang, Zhipeng Liang, Yang Wang, Xingtai Lv, Jiheng Zhang
16 Jun 2022

Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Conservative Natural Policy Gradient Primal-Dual Algorithm
AAAI Conference on Artificial Intelligence (AAAI), 2022
Qinbo Bai, Amrit Singh Bedi, Vaneet Aggarwal
12 Jun 2022

Momentum-Based Policy Gradient with Second-Order Information
Saber Salehkaleybar, Sadegh Khorasani, Negar Kiyavash, Niao He, Patrick Thiran
17 May 2022

PAGE-PG: A Simple and Loopless Variance-Reduced Policy Gradient Method with Probabilistic Gradient Estimation
International Conference on Machine Learning (ICML), 2022
Matilde Gargiani, Andrea Zanelli, Andrea Martinelli, Tyler H. Summers, John Lygeros
01 Feb 2022

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni, Ruiqi Zhang, Xiang Ji, Xuezhou Zhang, Mengdi Wang
OffRL
31 Jan 2022

On the Hidden Biases of Policy Mirror Ascent in Continuous Action Spaces
International Conference on Machine Learning (ICML), 2022
Amrit Singh Bedi, Souradip Chakraborty, Anjaly Parayil, Brian M Sadler, Erfaun Noorani, Alec Koppel
28 Jan 2022

Recent Advances in Reinforcement Learning in Finance
B. Hambly, Renyuan Xu, Huining Yang
OffRL
08 Dec 2021

MDPGT: Momentum-based Decentralized Policy Gradient Tracking
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zhanhong Jiang, Xian Yeow Lee, Sin Yong Tan, Kai Liang Tan, Aditya Balu, Young M. Lee, Chinmay Hegde, Soumik Sarkar
06 Dec 2021

Distributed Policy Gradient with Variance Reduction in Multi-Agent Reinforcement Learning
Xiaoxiao Zhao, Jinlong Lei, Li Li, Jie-bin Chen
OffRL
25 Nov 2021

On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding, Junzi Zhang, Javad Lavaei
19 Oct 2021

Theoretical Guarantees of Fictitious Discount Algorithms for Episodic Reinforcement Learning and Global Convergence of Policy Gradient Methods
Xin Guo, Anran Hu, Junzi Zhang
OffRL
13 Sep 2021

Mean-Field Multi-Agent Reinforcement Learning: A Decentralized Network Approach
Mathematics of Operations Research (MOR), 2021
Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
OOD
05 Aug 2021

A general sample complexity analysis of vanilla policy gradient
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Rui Yuan, Robert Mansel Gower, A. Lazaric
23 Jul 2021

Bregman Gradient Policy Optimization
Feihu Huang, Shangqian Gao, Heng-Chiao Huang
23 Jun 2021

On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Jiawei Huang, Nan Jiang
02 Jun 2021

Joint Optimization of Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Journal of Artificial Intelligence Research (JAIR), 2021
Qinbo Bai, Mridul Agarwal, Vaneet Aggarwal
28 May 2021

Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence
SIAM Journal on Optimization (SIAM J. Optim.), 2021
Wenhao Zhan, Shicong Cen, Baihe Huang, Yuxin Chen, Jason D. Lee, Yuejie Chi
24 May 2021

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method
Neural Information Processing Systems (NeurIPS), 2021
Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvári, Mengdi Wang
17 Feb 2021

Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
AAAI Conference on Artificial Intelligence (AAAI), 2020
Long Yang, Qian Zheng, Gang Pan
02 Dec 2020

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
International Conference on Machine Learning (ICML), 2020
Tengyu Xu, Yingbin Liang, Guanghui Lan
11 Nov 2020

A Study of Policy Gradient on a Class of Exactly Solvable Models
Gavin McCracken, Colin Daniels, Rosie Zhao, Anna M. Brandenberger, Prakash Panangaden, Doina Precup
03 Nov 2020

A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms
Adaptive Agents and Multi-Agent Systems (AAMAS), 2020
Shangtong Zhang, Romain Laroche, H. V. Seijen, Shimon Whiteson, Rémi Tachet des Combes
02 Oct 2020

Variance-Reduced Off-Policy Memory-Efficient Policy Search
Daoming Lyu, Qi Qi, Mohammad Ghavamzadeh, Hengshuai Yao, Tianbao Yang, Bo Liu
OffRL
14 Sep 2020

Beyond variance reduction: Understanding the true impact of baselines on policy optimization
International Conference on Machine Learning (ICML), 2020
Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux
OffRL
31 Aug 2020