ResearchTrend.AI

A general sample complexity analysis of vanilla policy gradient
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
arXiv: 2107.11433, v5 (latest)
Rui Yuan, Robert Mansel Gower, A. Lazaric (23 July 2021)

Papers citing "A general sample complexity analysis of vanilla policy gradient"

50 of 51 citing papers shown.
On the Sample Complexity of Differentially Private Policy Optimization
Yi He, Xingyu Zhou (24 Oct 2025)
Quantitative Convergence Analysis of Projected Stochastic Gradient Descent for Non-Convex Losses via the Goldstein Subdifferential
Yuping Zheng, Andrew G. Lamperski (03 Oct 2025)
Asynchronous Policy Gradient Aggregation for Efficient Distributed Reinforcement Learning
Alexander Tyurin, Andrei Spiridonov, Varvara Rudenko (29 Sep 2025)
On the Convergence of Policy Mirror Descent with Temporal Difference Evaluation
Jiacai Liu, Wenye Li, Ke Wei (23 Sep 2025)
GPG-HT: Generalized Policy Gradient with History-Aware Decision Transformer for Probabilistic Path Planning
Xing Wei, Yuqi Ouyang (24 Aug 2025)
Iterative Distillation for Reward-Guided Fine-Tuning of Diffusion Models in Biomolecular Design
Xingyu Su, Xiner Li, Masatoshi Uehara, Sunwoo Kim, Yulai Zhao, Gabriele Scalia, Ehsan Hajiramezanali, Tommaso Biancalani, D. Zhi, Shuiwang Ji (01 Jul 2025)
Reusing Trajectories in Policy Gradients Enables Fast Convergence
Alessandro Montenegro, Federico Mansutti, Marco Mussi, Matteo Papini, Alberto Maria Metelli (06 Jun 2025)
Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach
Swetha Ganesh, Vaneet Aggarwal (26 May 2025)
RobotIQ: Empowering Mobile Robots with Human-Level Planning for Real-World Execution
Emmanuel K. Raptis, Athanasios Ch. Kapoutsis, Elias B. Kosmatopoulos (18 Feb 2025)
Convergence of Policy Mirror Descent Beyond Compatible Function Approximation
Uri Sherman, Tomer Koren, Yishay Mansour (16 Feb 2025)
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
Neural Information Processing Systems (NeurIPS), 2025
Jincheng Mei, Bo Dai, Alekh Agarwal, Sharan Vaswani, Anant Raj, Csaba Szepesvári, Dale Schuurmans (11 Feb 2025)
A learning-based approach to stochastic optimal control under reach-avoid constraint
International Conference on Hybrid Systems: Computation and Control (HSCC), 2024
Tingting Ni, Maryam Kamgarpour (21 Dec 2024)
FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF
Adaptive Agents and Multi-Agent Systems (AAMAS), 2024
Flint Xiaofeng Fan, Cheston Tan, Yew-Soon Ong, Roger Wattenhofer, Wei Tsang Ooi (20 Dec 2024)
Structure Matters: Dynamic Policy Gradient
Sara Klein, Xiangyuan Zhang, Tamer Basar, Simon Weissmann, Leif Döring (07 Nov 2024)
On The Global Convergence Of Online RLHF With Neural Parametrization
Mudit Gaur, Amrit Singh Bedi, Raghu Pasupathy, Vaneet Aggarwal (21 Oct 2024)
Loss Landscape Characterization of Neural Networks without Over-Parametrization
Neural Information Processing Systems (NeurIPS), 2024
Rustem Islamov, Niccolò Ajroldi, Antonio Orvieto, Aurelien Lucchi (16 Oct 2024)
On the Convergence of Single-Timescale Actor-Critic
Navdeep Kumar, Priyank Agrawal, Giorgia Ramponi, Kfir Y. Levy, Shie Mannor (11 Oct 2024)
Towards Fast Rates for Federated and Multi-Task Reinforcement Learning
IEEE Conference on Decision and Control (CDC), 2024
Feng Zhu, Robert W. Heath Jr., Aritra Mitra (09 Sep 2024)
Complexity of Minimizing Projected-Gradient-Dominated Functions with Stochastic First-order Oracles
Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran (03 Aug 2024)
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
Alessandro Montenegro, Marco Mussi, Matteo Papini, Alberto Maria Metelli (15 Jul 2024)
Almost sure convergence rates of stochastic gradient methods under gradient domination
Simon Weissmann, Sara Klein, Waïss Azizian, Leif Döring (22 May 2024)
Policy Gradient with Active Importance Sampling
Matteo Papini, Giorgio Manganini, Alberto Maria Metelli, Marcello Restelli (09 May 2024)
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
International Conference on Machine Learning (ICML), 2024
Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli, Matteo Papini (03 May 2024)
Asynchronous Federated Reinforcement Learning with Policy Gradient Updates: Algorithm Design and Convergence Analysis
Guangchen Lan, Dong-Jun Han, Abolfazl Hashemi, Vaneet Aggarwal, Christopher G. Brinton (09 Apr 2024)
Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries
Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal (15 Mar 2024)
Towards Efficient Risk-Sensitive Policy Gradient: An Iteration Complexity Analysis
Rui Liu, Erfaun Noorani, John S. Baras (13 Mar 2024)
Towards Provable Log Density Policy Gradient
Pulkit Katdare, Anant Joshi, Katherine Driggs-Campbell (03 Mar 2024)
Stochastic Gradient Succeeds for Bandits
Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvári, Dale Schuurmans (27 Feb 2024)
On the Complexity of Finite-Sum Smooth Optimization under the Polyak-Łojasiewicz Condition
Yunyan Bai, Yuxing Liu, Luo Luo (04 Feb 2024)
On the Stochastic (Variance-Reduced) Proximal Gradient Method for Regularized Expected Reward Optimization
Ling Liang, Haizhao Yang (23 Jan 2024)
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
Journal of Scientific Computing (J. Sci. Comput.), 2024
Jie Feng, Ke Wei, Jinchi Chen (02 Jan 2024)
A safe exploration approach to constrained Markov decision processes
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Tingting Ni, Maryam Kamgarpour (01 Dec 2023)
On the Second-Order Convergence of Biased Policy Gradient Algorithms
International Conference on Machine Learning (ICML), 2023
Siqiao Mu, Diego Klabjan (05 Nov 2023)
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm with General Parameterization for Infinite Horizon Discounted Reward Markov Decision Processes
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Washim Uddin Mondal, Vaneet Aggarwal (18 Oct 2023)
Beyond Stationarity: Convergence Analysis of Stochastic Softmax Policy Gradient Methods
International Conference on Learning Representations (ICLR), 2023
Sara Klein, Simon Weissmann, Leif Döring (04 Oct 2023)
A Homogenization Approach for Gradient-Dominated Stochastic Optimization
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Jiyuan Tan, Chenyu Xue, Chuwen Zhang, Qi Deng, Dongdong Ge, Yinyu Ye (21 Aug 2023)
Fine-Tuning Language Models with Advantage-Induced Policy Alignment
Banghua Zhu, Hiteshi Sharma, Felipe Vieira Frujeri, Shi Dong, Chenguang Zhu, Michael I. Jordan, Jiantao Jiao (04 Jun 2023)
Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space
International Conference on Machine Learning (ICML), 2023
Anas Barakat, Ilyas Fatkhullin, Niao He (02 Jun 2023)
On the Linear Convergence of Policy Gradient under Hadamard Parameterization
Information and Inference: A Journal of the IMA, 2023
Jiacai Liu, Jinchi Chen, Ke Wei (31 May 2023)
Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees
Neural Information Processing Systems (NeurIPS), 2023
Sharan Vaswani, A. Kazemi, Reza Babanezhad, Nicolas Le Roux (24 May 2023)
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Neural Information Processing Systems (NeurIPS), 2023
Emmeran Johnson, Ciara Pike-Burke, Patrick Rebeschini (22 Feb 2023)
Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies
International Conference on Machine Learning (ICML), 2023
Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He (03 Feb 2023)
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Neural Information Processing Systems (NeurIPS), 2023
Carlo Alfano, Rui Yuan, Patrick Rebeschini (30 Jan 2023)
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu, Chen Xie, Qinwen Deng, Dongdong Ge, Yi-Li Ye (28 Jan 2023)
Understanding the Complexity Gains of Single-Task RL with a Curriculum
International Conference on Machine Learning (ICML), 2022
Qiyang Li, Yuexiang Zhai, Yi-An Ma, Sergey Levine (24 Dec 2022)
On the Global Convergence of Fitted Q-Iteration with Two-layer Neural Network Parametrization
International Conference on Machine Learning (ICML), 2022
Mudit Gaur, Vaneet Aggarwal, Mridul Agarwal (14 Nov 2022)
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Neural Information Processing Systems (NeurIPS), 2022
Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan (13 Oct 2022)
Stochastic Second-Order Methods Improve Best-Known Sample Complexity of SGD for Gradient-Dominated Function
Neural Information Processing Systems (NeurIPS), 2022
Saeed Masiha, Saber Salehkaleybar, Niao He, Negar Kiyavash, Patrick Thiran (25 May 2022)
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
Yuhao Ding, Junzi Zhang, Hyunin Lee, Javad Lavaei (19 Oct 2021)
On the Global Optimum Convergence of Momentum-based Policy Gradient
Yuhao Ding, Junzi Zhang, Javad Lavaei (19 Oct 2021)