Batch Policy Learning in Average Reward Markov Decision Processes


23 July 2020
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
S. Murphy
    OffRL

Papers citing "Batch Policy Learning in Average Reward Markov Decision Processes"

50 / 58 papers shown
Reinforcement Learning for Individual Optimal Policy from Heterogeneous Data
Rui Miao
B. Shahbaba
A. Qu
OffRL
15
0
0
14 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
126
0
0
01 May 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
65
0
0
22 Feb 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML
OffRL
77
1
0
08 Dec 2024
StepCountJITAI: simulation environment for RL with application to physical activity adaptive intervention
Karine Karine
Benjamin M. Marlin
50
0
0
01 Nov 2024
Combining Experimental and Historical Data for Policy Evaluation
Ting Li
Chengchun Shi
Qianglin Wen
Yang Sui
Yongli Qin
Chunbo Lai
Hongtu Zhu
OffRL
44
0
0
01 Jun 2024
Spatially Randomized Designs Can Enhance Policy Evaluation
Ying Yang
Chengchun Shi
Fang Yao
Shouyang Wang
Hongtu Zhu
OffRL
33
0
0
18 Mar 2024
Regularized DeepIV with Model Selection
Zihao Li
Hui Lan
Vasilis Syrgkanis
Mengdi Wang
Masatoshi Uehara
36
2
0
07 Mar 2024
Offline Multi-task Transfer RL with Representational Penalization
Avinandan Bose
S. S. Du
Maryam Fazel
OffRL
49
12
0
19 Feb 2024
Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap
Mohammad Mehrabi
Stefan Wager
OffRL
24
14
0
13 Feb 2024
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
Jin Zhu
Runzhe Wan
Zhengling Qi
S. Luo
C. Shi
OffRL
32
0
0
28 Oct 2023
Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage
Kishan Panaganti
Zaiyan Xu
D. Kalathil
Mohammad Ghavamzadeh
OOD
OffRL
23
6
0
27 Oct 2023
Randomization Inference When N Equals One
Tengyuan Liang
Benjamin Recht
CML
27
5
0
25 Oct 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou
Yuhan Li
Ruoqing Zhu
Annie Qu
OffRL
21
4
0
23 Sep 2023
Off-policy Evaluation in Doubly Inhomogeneous Environments
Zeyu Bian
C. Shi
Zhengling Qi
Lan Wang
OffRL
27
3
0
14 Jun 2023
Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing
Ting Li
C. Shi
Zhaohua Lu
Yi Li
Hongtu Zhu
33
2
0
17 May 2023
Assessing the Impact of Context Inference Error and Partial Observability on RL Methods for Just-In-Time Adaptive Interventions
Karine Karine
P. Klasnja
Susan A. Murphy
Benjamin M. Marlin
OffRL
14
1
0
17 May 2023
Sequential Knockoffs for Variable Selection in Reinforcement Learning
Tao Ma
Hengrui Cai
Zhengling Qi
C. Shi
Eric B. Laber
16
3
0
24 Mar 2023
A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations
Siyu Chen
Yitan Wang
Zhaoran Wang
Zhuoran Yang
OffRL
28
2
0
20 Mar 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
Hanlin Zhu
Amy Zhang
OffRL
8
2
0
07 Feb 2023
ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints
Akhil Agnihotri
R. Jain
Haipeng Luo
16
1
0
02 Feb 2023
Revisiting Bellman Errors for Offline Model Selection
Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok (University of North Carolina at Chapel Hill)
OffRL
27
5
0
31 Jan 2023
A Reinforcement Learning Framework for Dynamic Mediation Analysis
Linjuan Ge
Jitao Wang
C. Shi
Zhanghua Wu
Rui Song
27
5
0
31 Jan 2023
STEEL: Singularity-aware Reinforcement Learning
Xiaohong Chen
Zhengling Qi
Runzhe Wan
OffRL
22
2
0
30 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
30
15
0
30 Jan 2023
Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
C. Shi
Zhengling Qi
Jianing Wang
Fan Zhou
OffRL
20
3
0
05 Jan 2023
Deep Spectral Q-learning with Application to Mobile Health
Yuhe Gao
C. Shi
R. Song
11
0
0
03 Jan 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
Yang Xu
Jin Zhu
C. Shi
S. Luo
R. Song
OffRL
21
12
0
29 Dec 2022
Quantile Off-Policy Evaluation via Deep Conditional Generative Learning
Yang Xu
C. Shi
S. Luo
Lan Wang
R. Song
OffRL
27
4
0
29 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
28
66
0
13 Dec 2022
Doubly Inhomogeneous Reinforcement Learning
Liyuan Hu
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
24
2
0
08 Nov 2022
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
Paria Rashidinejad
Hanlin Zhu
Kunhe Yang
Stuart J. Russell
Jiantao Jiao
OffRL
33
26
0
01 Nov 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
23
16
0
26 Jul 2022
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
Shentao Yang
Yihao Feng
Shujian Zhang
Mi Zhou
OffRL
32
12
0
14 Jun 2022
Conformal Off-policy Prediction
Yingying Zhang
C. Shi
S. Luo
OffRL
28
10
0
14 Jun 2022
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Tengyu Xu
Yue Wang
Shaofeng Zou
Yingbin Liang
OffRL
22
12
0
13 Jun 2022
Jump-Start Reinforcement Learning
Ikechukwu Uchendu
Ted Xiao
Yao Lu
Banghua Zhu
Mengyuan Yan
...
Chuyuan Fu
Cong Ma
Jiantao Jiao
Sergey Levine
Karol Hausman
OffRL
OnRL
28
107
0
05 Apr 2022
Testing Stationarity and Change Point Detection in Reinforcement Learning
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
37
9
0
03 Mar 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
C. Shi
S. Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL
OnRL
24
10
0
26 Feb 2022
Policy Evaluation for Temporal and/or Spatial Dependent Experiments
S. Luo
Ying Yang
Chengchun Shi
Fang Yao
Jieping Ye
Hongtu Zhu
41
5
0
22 Feb 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
S. Luo
Hong Zhu
R. Song
OffRL
21
30
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
S. Luo
R. Song
Hongtu Zhu
OffRL
33
6
0
21 Feb 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen
Zhengling Qi
OffRL
28
31
0
17 Jan 2022
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
16
4
0
29 Nov 2021
A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes
C. Shi
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
11
22
0
12 Nov 2021
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
27
9
0
24 Oct 2021
Off-Policy Evaluation in Partially Observed Markov Decision Processes under Sequential Ignorability
Yupeng Tang
Seung-seob Lee
OffRL
52
22
0
24 Oct 2021
Projected State-action Balancing Weights for Offline Reinforcement Learning
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
OffRL
22
16
0
10 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
24
111
0
19 Aug 2021
The Curse of Passive Data Collection in Batch Reinforcement Learning
Chenjun Xiao
Ilbin Lee
Bo Dai
Dale Schuurmans
Csaba Szepesvári
OffRL
17
1
0
18 Jun 2021