Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime

International Conference on Machine Learning (ICML), 2022
18 January 2022
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch

Papers citing "Convergence of Policy Gradient for Entropy Regularized MDPs with Neural Network Approximation in the Mean-Field Regime"

15 / 15 papers shown
Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control
Chengxiu Hua
Jiawen Gu
Yushun Tang
299
1
0
20 Oct 2025
Phase Diagram of Dropout for Two-Layer Neural Networks in the Mean-Field Regime
Lénaic Chizat
Pierre Marion
Yerkin Yesbay
146
1
0
08 Oct 2025
RPRO: Ranked Preference Reinforcement Optimization for Enhancing Medical QA and Diagnostic Reasoning
Chia-Hsuan Hsu
J. Ding
Hsin-Ling Hsu
Feng Liu
Li-Hung Yao
Chun-Chieh Liao
Feng Liu
Fang-Ming Hung
LRM
304
0
0
31 Aug 2025
Efficient Computation of Blackwell Optimal Policies using Rational Functions
Dibyangshu Mukherjee
Shivaram Kalyanakrishnan
OffRL
111
1
0
25 Aug 2025
Mean-Field Generalisation Bounds for Learning Controls in Stochastic Environments
Boris Baros
Samuel N. Cohen
C. Reisinger
AI4CE
203
1
0
21 Aug 2025
Non-convex entropic mean-field optimization via Best Response flow
Razvan-Andrei Lascu
Mateusz B. Majka
351
2
0
28 May 2025
Meta-reinforcement learning with minimum attention
Pilhwa Lee
Shashank Gupta
OffRL
372
0
0
22 May 2025
Linear convergence of proximal descent schemes on the Wasserstein space
Razvan-Andrei Lascu
Mateusz B. Majka
David Siska
Łukasz Szpruch
442
2
0
22 Nov 2024
A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces
B. Kerimkulov
J. Leahy
David Siska
Lukasz Szpruch
Yufei Zhang
457
17
0
04 Oct 2023
Policy Optimization for Continuous Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
460
34
0
30 May 2023
Matryoshka Policy Gradient for Entropy-Regularized RL: Convergence and Global Optimality
François Ged
M. H. Veiga
363
2
0
22 Mar 2023
Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic
International Conference on Machine Learning (ICML), 2023
Wesley A Suttle
Amrit Singh Bedi
Bhrij Patel
Brian M Sadler
Alec Koppel
Dinesh Manocha
328
24
0
28 Jan 2023
Geometry and convergence of natural policy gradient methods
Information Geometry (IG), 2022
Johannes Muller
Guido Montúfar
379
17
0
03 Nov 2022
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
SIAM Journal on Control and Optimization (SICON), 2022
Michael Giegrich
Christoph Reisinger
Yufei Zhang
360
24
0
01 Nov 2022
Linear convergence of a policy gradient method for some finite horizon continuous time control problems
SIAM Journal on Control and Optimization (SICON), 2022
C. Reisinger
Wolfgang Stockinger
Yufei Zhang
476
12
0
22 Mar 2022