ResearchTrend.AI
arXiv: 2111.11232
Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms

22 November 2021
Yanwei Jia
X. Zhou
    OffRL

Papers citing "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms"

37 / 37 papers shown
Title
Efficient Learning for Entropy-Regularized Markov Decision Processes via Multilevel Monte Carlo
Matthieu Meunier
C. Reisinger
Yufei Zhang
39
0
0
27 Mar 2025
Accuracy of Discretely Sampled Stochastic Policies in Continuous-time Reinforcement Learning
Yanwei Jia
Du Ouyang
Yufei Zhang
40
3
0
13 Mar 2025
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka
Alejandro Escontrela
Pieter Abbeel
Yi-An Ma
DiffM
89
23
0
17 Feb 2025
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
D. Yao
Wenpin Tang
55
0
0
03 Feb 2025
Exploratory Utility Maximization Problem with Tsallis Entropy
Chen Ziyi
Gu Jia-wen
53
0
0
03 Feb 2025
Reinforcement Learning for Jump-Diffusions, with Financial Applications
Xuefeng Gao
Lingfei Li
X. Zhou
39
1
0
08 Jan 2025
Robust Reinforcement Learning under Diffusion Models for Data with Jumps
Chenyang Jiang
Donggyu Kim
Alejandra Quintos
Yazhen Wang
72
0
0
18 Nov 2024
Regret of exploratory policy improvement and $q$-learning
Wenpin Tang
X. Zhou
39
0
0
02 Nov 2024
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
Harley Wiltzer
Marc G. Bellemare
D. Meger
Patrick Shafto
Yash Jhaveri
29
1
0
14 Oct 2024
On the grid-sampling limit SDE
Christian Bender
Nguyen Tran Thuan
16
1
0
10 Oct 2024
A random measure approach to reinforcement learning in continuous time
Christian Bender
Nguyen Tran Thuan
20
2
0
25 Sep 2024
Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
David D. Yao
Wenpin Tang
37
3
0
12 Sep 2024
Reward-Directed Score-Based Diffusion Models via q-Learning
Xuefeng Gao
Jiale Zha
X. Zhou
DiffM
28
2
0
07 Sep 2024
Exploratory Optimal Stopping: A Singular Control Formulation
Jodi Dianetti
Giorgio Ferrari
Renyuan Xu
26
3
0
18 Aug 2024
On Bellman equations for continuous-time policy evaluation I: discretization and approximation
Wenlong Mou
Yuhua Zhu
OffRL
29
2
0
08 Jul 2024
Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management
Huiling Meng
Ningyuan Chen
Xuefeng Gao
55
1
0
08 Jun 2024
Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
R. Denkert
Huyen Pham
X. Warin
33
4
0
27 Apr 2024
Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty
Yanwei Jia
36
2
0
19 Apr 2024
Score-based Diffusion Models via Stochastic Differential Equations -- a Technical Tutorial
Wenpin Tang
Hanyang Zhao
DiffM
36
23
0
12 Feb 2024
Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning
Xiangyu Cui
Xun Li
Yun Shi
Si Zhao
27
1
0
24 Dec 2023
Data-driven optimal stopping: A pure exploration analysis
Soren Christensen
Niklas Dexheimer
C. Strauch
36
2
0
10 Dec 2023
Fast Policy Learning for Linear Quadratic Control with Entropy Regularization
Xin Guo
Xinyu Li
Renyuan Xu
34
3
0
23 Nov 2023
Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
Andrea Angiuli
J. Fouque
Ruimeng Hu
Alan Raydan
30
5
0
19 Sep 2023
Actor critic learning algorithms for mean-field control with moment neural networks
Huyen Pham
X. Warin
30
5
0
08 Sep 2023
Continuous-time q-learning for mean-field control problems
Xiaoli Wei
Xian Yu
29
8
0
28 Jun 2023
Policy Optimization for Continuous Reinforcement Learning
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
24
17
0
30 May 2023
Actor-Critic learning for mean-field control in continuous time
N. Frikha
Maximilien Germain
Mathieu Laurière
H. Pham
Xuan Song
30
16
0
13 Mar 2023
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
Zichen Zhang
Johannes Kirschner
Junxi Zhang
Francesco Zanini
Alex Ayoub
Masood Dehghan
Dale Schuurmans
OffRL
11
3
0
17 Dec 2022
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
Michael Giegrich
Christoph Reisinger
Yufei Zhang
14
11
0
01 Nov 2022
Square-root regret bounds for continuous-time episodic Markov decision processes
Xuefeng Gao
X. Zhou
43
6
0
03 Oct 2022
Choquet regularization for reinforcement learning
Xia Han
Ruodu Wang
X. Zhou
21
2
0
17 Aug 2022
Optimal scheduling of entropy regulariser for continuous-time linear-quadratic reinforcement learning
Lukasz Szpruch
Tanut Treetanthiploet
Yufei Zhang
11
8
0
08 Aug 2022
q-Learning in Continuous Time
Yanwei Jia
X. Zhou
OffRL
40
67
0
02 Jul 2022
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
29
8
0
23 May 2022
Linear convergence of a policy gradient method for some finite horizon continuous time control problems
C. Reisinger
Wolfgang Stockinger
Yufei Zhang
16
5
0
22 Mar 2022
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
24
165
0
08 Dec 2021
Deep Reinforcement Learning for Autonomous Driving: A Survey
B. R. Kiran
Ibrahim Sobh
V. Talpaert
Patrick Mannion
A. A. Sallab
S. Yogamani
P. Pérez
143
1,628
0
02 Feb 2020