Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings

13 January 2020
C. Shi
Shengyao Zhang
W. Lu
R. Song
    OffRL

Papers citing "Statistical Inference of the Value Function for Reinforcement Learning in Infinite Horizon Settings"

50 / 63 papers shown
A Two-armed Bandit Framework for A/B Testing
Jinjuan Wang
Qianglin Wen
Yu Zhang
Xiaodong Yan
Chengchun Shi
221
1
0
24 Jul 2025
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou
Josiah P. Hanna
Jin Zhu
Ying Yang
Chengchun Shi
OffRL
249
4
0
28 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
992
1
0
01 May 2025
IGN : Implicit Generative Networks
International Conference on Machine Learning and Applications (ICMLA), 2022
Haozheng Luo
Tianyi Wu
Feiyu Han
Zhijun Yan
OffRL
415
1
0
24 Feb 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
691
5
0
22 Feb 2025
Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing
Jitao Wang
C. Shi
John D. Piette
Joshua R. Loftus
Donglin Zeng
Zhenke Wu
OffRL
548
3
0
10 Jan 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML OffRL
370
7
0
08 Dec 2024
Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences
Runpeng Dai
Jianing Wang
Fan Zhou
Shuang Luo
Zhiwei Qin
Chengchun Shi
Hongtu Zhu
CML OffRL
313
3
0
25 Jul 2024
Dynamic Online Recommendation for Two-Sided Market with Bayesian Incentive Compatibility
Yuantong Li
Guang Cheng
Xiaowu Dai
238
1
0
04 Jun 2024
Combining Experimental and Historical Data for Policy Evaluation
Ting Li
Chengchun Shi
Qianglin Wen
Yang Sui
Yongli Qin
Chunbo Lai
Hongtu Zhu
OffRL
481
4
0
01 Jun 2024
Estimation of subsidiary performance metrics under optimal policies
Statistica Sinica (SS), 2024
Zhaoqi Li
Houssam Nassif
Alex Luedtke
217
5
0
08 Jan 2024
Neural Network Approximation for Pessimistic Offline Reinforcement Learning
Di Wu
Yuling Jiao
Li Shen
Haizhao Yang
Xiliang Lu
OffRL
300
2
0
19 Dec 2023
AI in Pharma for Personalized Sequential Decision-Making: Methods, Applications and Opportunities
Yuhan Li
Hongtao Zhang
Keaven M Anderson
Songzi Li
Ruoqing Zhu
190
0
0
30 Nov 2023
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu
Jiyuan Tu
Yichen Zhang
Xi Chen
OffRL
422
5
0
04 Oct 2023
Estimation and Inference in Distributional Reinforcement Learning
Liangyu Zhang
Yang Peng
Jiadong Liang
Wenhao Yang
Zhihua Zhang
OffRL
201
4
0
29 Sep 2023
Stackelberg Batch Policy Learning
Wenzhuo Zhou
Annie Qu
OffRL
328
1
0
28 Sep 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou
Yuhan Li
Ruoqing Zhu
Annie Qu
OffRL
327
7
0
23 Sep 2023
Statistical Inference on Multi-armed Bandits with Delayed Feedback
International Conference on Machine Learning (ICML), 2023
Lei Shi
Jingshen Wang
Tianhao Wu
355
7
0
03 Jul 2023
Off-policy Evaluation in Doubly Inhomogeneous Environments
Journal of the American Statistical Association (JASA), 2023
Zeyu Bian
C. Shi
Zhengling Qi
Lan Wang
OffRL
328
12
0
14 Jun 2023
Testing for the Markov Property in Time Series via Deep Conditional Generative Learning
Journal of the Royal Statistical Society Series B: Statistical Methodology (JRSSB), 2023
Yunzhe Zhou
C. Shi
Lexin Li
Q. Yao
AI4TS
215
17
0
30 May 2023
Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing
Journal of the American Statistical Association (JASA), 2023
Ting Li
C. Shi
Zhaohua Lu
Yi Li
Hongtu Zhu
261
10
0
17 May 2023
Conformal Off-Policy Evaluation in Markov Decision Processes
IEEE Conference on Decision and Control (CDC), 2023
Daniele Foffano
Alessio Russo
Alexandre Proutiere
OffRL
425
9
0
05 Apr 2023
Sequential Knockoffs for Variable Selection in Reinforcement Learning
Tao Ma
Hengrui Cai
Zhengling Qi
C. Shi
Eric B. Laber
347
7
0
24 Mar 2023
Statistical Inference with Stochastic Gradient Methods under $\phi$-mixing Data
Ruiqi Liu
Xinyu Chen
Zuofeng Shang
FedML
407
7
0
24 Feb 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Neural Information Processing Systems (NeurIPS), 2023
Masatoshi Uehara
Nathan Kallus
Jason D. Lee
Wen Sun
OffRL
389
8
0
05 Feb 2023
Revisiting Bellman Errors for Offline Model Selection
International Conference on Machine Learning (ICML), 2023
Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok (University of North Carolina at Chapel Hill)
OffRL
335
6
0
31 Jan 2023
A Reinforcement Learning Framework for Dynamic Mediation Analysis
International Conference on Machine Learning (ICML), 2023
Linjuan Ge
Jitao Wang
C. Shi
Zhanghua Wu
Rui Song
393
6
0
31 Jan 2023
STEEL: Singularity-aware Reinforcement Learning
Xiaohong Chen
Zhengling Qi
Runzhe Wan
OffRL
490
3
0
30 Jan 2023
Asymptotic Inference for Multi-Stage Stationary Treatment Policy with Variable Selection
Daiqi Gao
Yufeng Liu
D. Zeng
OffRL
289
0
0
29 Jan 2023
Quasi-optimal Reinforcement Learning with Continuous Actions
International Conference on Learning Representations (ICLR), 2023
Yuhan Li
Wenzhuo Zhou
Ruoqing Zhu
OffRL
276
9
0
21 Jan 2023
Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
Journal of the American Statistical Association (JASA), 2023
C. Shi
Zhengling Qi
Jianing Wang
Fan Zhou
OffRL
227
9
0
05 Jan 2023
Deep Spectral Q-learning with Application to Mobile Health
Yuhe Gao
C. Shi
R. Song
198
0
0
03 Jan 2023
Inference on Time Series Nonparametric Conditional Moment Restrictions Using General Sieves
Xiaohong Chen
Yuan Liao
Weichen Wang
227
0
0
31 Dec 2022
Online Statistical Inference for Contextual Bandits via Stochastic Gradient Descent
Xinyu Chen
Zehua Lai
He Li
Yichen Zhang
Zhihong Liu
266
7
0
30 Dec 2022
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
International Conference on Machine Learning (ICML), 2022
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
361
24
0
29 Dec 2022
Quantile Off-Policy Evaluation via Deep Conditional Generative Learning
Yang Xu
C. Shi
Shuang Luo
Lan Wang
R. Song
OffRL
296
6
0
29 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
299
114
0
13 Dec 2022
Doubly Inhomogeneous Reinforcement Learning
Liyuan Hu
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
523
3
0
08 Nov 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Neural Information Processing Systems (NeurIPS), 2022
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
499
25
0
26 Jul 2022
Conformal Off-policy Prediction
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yingying Zhang
C. Shi
Shuang Luo
OffRL
319
15
0
14 Jun 2022
Testing Stationarity and Change Point Detection in Reinforcement Learning
Annals of Statistics (Ann. Stat.), 2022
Mengbing Li
C. Shi
Zhanghua Wu
Piotr Fryzlewicz
OffRL
582
14
0
03 Mar 2022
Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons
Journal of the American Statistical Association (JASA), 2022
C. Shi
Shuang Luo
Yuan Le
Hongtu Zhu
R. Song
OffRL OnRL
278
17
0
26 Feb 2022
Policy Evaluation for Temporal and/or Spatial Dependent Experiments
Shuang Luo
Ying Yang
Chengchun Shi
Fang Yao
Jieping Ye
Hongtu Zhu
634
11
0
22 Feb 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
Journal of the American Statistical Association (JASA), 2022
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
444
44
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
Annals of Applied Statistics (AOAS), 2022
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
426
8
0
21 Feb 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
International Conference on Machine Learning (ICML), 2022
Ruiqi Zhang
Xuezhou Zhang
Chengzhuo Ni
Mengdi Wang
OffRL
267
19
0
10 Feb 2022
Transfer Q-learning
Elynn Y. Chen
Sai Li
OffRL OnRL
278
4
0
09 Feb 2022
Reinforcement Learning with Heterogeneous Data: Estimation and Inference
Elynn Y. Chen
Rui Song
Michael I. Jordan
OffRL
289
10
0
31 Jan 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
International Conference on Machine Learning (ICML), 2022
Xiaohong Chen
Zhengling Qi
OffRL
494
36
0
17 Jan 2022
A Statistical Analysis of Polyak-Ruppert Averaged Q-learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Xiang Li
Wenhao Yang
Jiadong Liang
Zhihua Zhang
Michael I. Jordan
448
24
0
29 Dec 2021