ResearchTrend.AI

q-Learning in Continuous Time

2 July 2022
Yanwei Jia
X. Zhou
OffRL
arXiv: 2207.00713

Papers citing "q-Learning in Continuous Time"

32 / 32 papers shown
Title
Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do
Yoav Wald
M. Goldstein
Yonathan Efroni
Wouter A. C. van Amsterdam
Rajesh Ranganath
CML
74
0
0
20 Mar 2025
Accuracy of Discretely Sampled Stochastic Policies in Continuous-time Reinforcement Learning
Yanwei Jia
Du Ouyang
Yufei Zhang
40
3
0
13 Mar 2025
Learning a Diffusion Model Policy from Rewards via Q-Score Matching
Michael Psenka
Alejandro Escontrela
Pieter Abbeel
Yi-An Ma
DiffM
89
23
0
17 Feb 2025
Exploratory Utility Maximization Problem with Tsallis Entropy
Chen Ziyi
Gu Jia-wen
53
0
0
03 Feb 2025
Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
D. Yao
Wenpin Tang
55
0
0
03 Feb 2025
Reinforcement Learning for Jump-Diffusions, with Financial Applications
Xuefeng Gao
Lingfei Li
X. Zhou
39
1
0
08 Jan 2025
Robust Reinforcement Learning under Diffusion Models for Data with Jumps
Chenyang Jiang
Donggyu Kim
Alejandra Quintos
Yazhen Wang
72
0
0
18 Nov 2024
Regret of exploratory policy improvement and q-learning
Wenpin Tang
X. Zhou
39
0
0
02 Nov 2024
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
Harley Wiltzer
Marc G. Bellemare
D. Meger
Patrick Shafto
Yash Jhaveri
29
1
0
14 Oct 2024
On the grid-sampling limit SDE
Christian Bender
Nguyen Tran Thuan
16
1
0
10 Oct 2024
A random measure approach to reinforcement learning in continuous time
Christian Bender
Nguyen Tran Thuan
20
2
0
25 Sep 2024
Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning
Hanyang Zhao
Haoxian Chen
Ji Zhang
David D. Yao
Wenpin Tang
37
3
0
12 Sep 2024
Reward-Directed Score-Based Diffusion Models via q-Learning
Xuefeng Gao
Jiale Zha
X. Zhou
DiffM
26
2
0
07 Sep 2024
Exploratory Optimal Stopping: A Singular Control Formulation
Jodi Dianetti
Giorgio Ferrari
Renyuan Xu
26
3
0
18 Aug 2024
On Bellman equations for continuous-time policy evaluation I: discretization and approximation
Wenlong Mou
Yuhua Zhu
OffRL
29
2
0
08 Jul 2024
Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy
Lijun Bo
Yijie Huang
Xiang Yu
Tingting Zhang
39
3
0
04 Jul 2024
Reinforcement Learning for Intensity Control: An Application to Choice-Based Network Revenue Management
Huiling Meng
Ningyuan Chen
Xuefeng Gao
55
1
0
08 Jun 2024
Control randomisation approach for policy gradient and application to reinforcement learning in optimal switching
R. Denkert
Huyen Pham
X. Warin
33
4
0
27 Apr 2024
On the stability of Lipschitz continuous control problems and its application to reinforcement learning
Namkyeong Cho
Yeoneung Kim
21
0
0
20 Apr 2024
Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty
Yanwei Jia
36
2
0
19 Apr 2024
Score-based Diffusion Models via Stochastic Differential Equations -- a Technical Tutorial
Wenpin Tang
Hanyang Zhao
DiffM
36
23
0
12 Feb 2024
Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning
Xiangyu Cui
Xun Li
Yun Shi
Si Zhao
27
1
0
24 Dec 2023
Data-Driven Merton's Strategies via Policy Randomization
Min Dai
Yuchao Dong
Yanwei Jia
Xun Yu Zhou
33
10
0
19 Dec 2023
Fast Policy Learning for Linear Quadratic Control with Entropy Regularization
Xin Guo
Xinyu Li
Renyuan Xu
34
3
0
23 Nov 2023
Deep Reinforcement Learning for Infinite Horizon Mean Field Problems in Continuous Spaces
Andrea Angiuli
J. Fouque
Ruimeng Hu
Alan Raydan
30
5
0
19 Sep 2023
Continuous-time q-learning for mean-field control problems
Xiaoli Wei
Xian Yu
29
8
0
28 Jun 2023
Policy Optimization for Continuous Reinforcement Learning
Hanyang Zhao
Wenpin Tang
D. Yao
OffRL
24
17
0
30 May 2023
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems
Michael Giegrich
Christoph Reisinger
Yufei Zhang
14
11
0
01 Nov 2022
Square-root regret bounds for continuous-time episodic Markov decision processes
Xuefeng Gao
X. Zhou
40
6
0
03 Oct 2022
Choquet regularization for reinforcement learning
Xia Han
Ruodu Wang
X. Zhou
21
2
0
17 Aug 2022
Logarithmic regret bounds for continuous-time average-reward Markov decision processes
Xuefeng Gao
X. Zhou
29
8
0
23 May 2022
Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms
Yanwei Jia
X. Zhou
OffRL
16
79
0
22 Nov 2021