Accountable Off-Policy Evaluation With Kernel Bellman Statistics

15 August 2020
Yihao Feng, Zhaolin Ren, Ziyang Tang, Qiang Liu
OffRL
ArXiv (abs) · PDF · HTML

Papers citing "Accountable Off-Policy Evaluation With Kernel Bellman Statistics"

34 papers shown
Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
Nan Jiang, Tengyang Xie
OffRL
05 Oct 2025

Sampling Complexity of TD and PPO in RKHS
Lu Zou, Wendi Ren, Weizhong Zhang, Liang Ding, Shuang Li
29 Sep 2025

Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou, Josiah P. Hanna, Jin Zhu, Ying Yang, Chengchun Shi
OffRL
28 May 2025

Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
22 Feb 2025

Combining Experimental and Historical Data for Policy Evaluation
Ting Li, Chengchun Shi, Qianglin Wen, Yang Sui, Yongli Qin, Chunbo Lai, Hongtu Zhu
OffRL
01 Jun 2024

Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi, Mathias Formoso, Othman Gaizi, Achraf Azize, Evrard Garcelon
OffRL
24 Dec 2023

Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
Longchao Da, Porter Jenkins, Trevor Schwantes, Jeffrey Dotson, Hua Wei
OffRL
17 Dec 2023

Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou, Yuhan Li, Ruoqing Zhu, Annie Qu
OffRL
23 Sep 2023

Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Jonas Rothfuss, Bhavya Sukhija, Tobias Birchler, Parnian Kassraie, Andreas Krause
OffRL
02 Mar 2023

A Reinforcement Learning Framework for Dynamic Mediation Analysis
International Conference on Machine Learning (ICML), 2023
Linjuan Ge, Jitao Wang, C. Shi, Zhanghua Wu, Rui Song
31 Jan 2023

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization
Journal of the American Statistical Association (JASA), 2023
C. Shi, Zhengling Qi, Jianing Wang, Fan Zhou
OffRL
05 Jan 2023

An Instrumental Variable Approach to Confounded Off-Policy Evaluation
International Conference on Machine Learning (ICML), 2022
Yang Xu, Jin Zhu, C. Shi, Shuang Luo, R. Song
OffRL
29 Dec 2022

A Unified Framework for Alternating Offline Model Training and Policy Learning
Neural Information Processing Systems (NeurIPS), 2022
Shentao Yang, Shujian Zhang, Yihao Feng, Mi Zhou
OffRL
12 Oct 2022

Conformal Off-policy Prediction
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yingying Zhang, C. Shi, Shuang Luo
OffRL
14 Jun 2022

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Neural Information Processing Systems (NeurIPS), 2022
Seyed Kamyar Seyed Ghasemipour, S. Gu, Ofir Nachum
OffRL
27 May 2022

Bellman Residual Orthogonalization for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Andrea Zanette, Martin J. Wainwright
OffRL
24 Mar 2022

Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
Journal of the American Statistical Association (JASA), 2022
C. Shi, Jin Zhu, Ye Shen, Shuang Luo, Hong Zhu, R. Song
OffRL
22 Feb 2022

Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
International Conference on Machine Learning (ICML), 2022
Ruiqi Zhang, Xuezhou Zhang, Chengzhuo Ni, Mengdi Wang
OffRL
10 Feb 2022

On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
International Conference on Machine Learning (ICML), 2022
Xiaohong Chen, Zhengling Qi
OffRL
17 Jan 2022

Hyperparameter Selection Methods for Fitted Q-Evaluation with Error Guarantee
Kohei Miyaguchi
OffRL
07 Jan 2022

Explaining Off-Policy Actor-Critic From A Bias-Variance Perspective
Ting-Han Fan, Peter J. Ramadge
CML, FAtt, OffRL
06 Oct 2021

Optimal policy evaluation using kernel-based temporal difference methods
Annals of Statistics (Ann. Stat.), 2021
Yaqi Duan, Mengdi Wang, Martin J. Wainwright
OffRL
24 Sep 2021

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Neural Information Processing Systems (NeurIPS), 2021
Ming Yin, Yu Wang
OffRL
13 May 2021

Deeply-Debiased Off-Policy Interval Estimation
International Conference on Machine Learning (ICML), 2021
C. Shi, Runzhe Wan, Victor Chernozhukov, R. Song
OffRL
10 May 2021

Nearly Horizon-Free Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Zhaolin Ren, Jialian Li, Bo Dai, S. Du, Sujay Sanghavi
OffRL
25 Mar 2021

Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm
Lin Chen, B. Scherrer, Peter L. Bartlett
OffRL
17 Mar 2021

Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
International Conference on Learning Representations (ICLR), 2021
Yihao Feng, Ziyang Tang, Na Zhang, Qiang Liu
OffRL
09 Mar 2021

Instabilities of Offline RL with Pre-Trained Neural Representation
International Conference on Machine Learning (ICML), 2021
Ruosong Wang, Yifan Wu, Ruslan Salakhutdinov, Sham Kakade
OffRL
08 Mar 2021

Bootstrapping Fitted Q-Evaluation for Off-Policy Inference
International Conference on Machine Learning (ICML), 2021
Botao Hao, X. Ji, Yaqi Duan, Hao Lu, Csaba Szepesvári, Mengdi Wang
OffRL
06 Feb 2021

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Neural Information Processing Systems (NeurIPS), 2021
Ming Yin, Yu Bai, Yu Wang
OffRL
02 Feb 2021

Offline Policy Selection under Uncertainty
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Mengjiao Yang, Bo Dai, Ofir Nachum, George Tucker, Dale Schuurmans
OffRL
12 Dec 2020

Off-Policy Interval Estimation with Lipschitz Value Iteration
Neural Information Processing Systems (NeurIPS), 2020
Ziyang Tang, Yihao Feng, Na Zhang, Jian Peng, Qiang Liu
OffRL
29 Oct 2020

What are the Statistical Limits of Offline RL with Linear Function Approximation?
Ruosong Wang, Dean Phillips Foster, Sham Kakade
OffRL
22 Oct 2020

Batch Value-function Approximation with Only Realizability
International Conference on Machine Learning (ICML), 2020
Tengyang Xie, Nan Jiang
OffRL
11 Aug 2020