arXiv:1908.08526 (v3, latest)
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Journal of Machine Learning Research (JMLR), 2019
22 August 2019
Nathan Kallus
Masatoshi Uehara
OffRL
Papers citing
"Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes"
50 / 127 papers shown
Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation
Feichen Gan
Youcun Lu
Yingying Zhang
Yukun Liu
OffRL
145
0
0
29 Oct 2025
Learning density ratios in causal inference using Bregman-Riesz regression
Oliver J. Hines
Caleb H. Miles
CML
185
3
0
17 Oct 2025
Latent Variable Modeling for Robust Causal Effect Estimation
Tetsuro Morimura
Tatsushi Oka
Yugo Suzuki
Daisuke Moriwaki
CML
198
0
0
27 Aug 2025
A Two-armed Bandit Framework for A/B Testing
Jinjuan Wang
Qianglin Wen
Yu Zhang
Xiaodong Yan
Chengchun Shi
220
1
0
24 Jul 2025
Doubly Robust Alignment for Large Language Models
Erhan Xu
Kai Ye
Hongyi Zhou
Luhan Zhu
Francesco Quinzan
Chengchun Shi
353
7
0
01 Jun 2025
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou
Josiah P. Hanna
Jin Zhu
Ying Yang
Chengchun Shi
OffRL
249
4
0
28 May 2025
Treatment Effect Estimation for Optimal Decision-Making
Dennis Frauen
Valentyn Melnychuk
Jonas Schweisthal
Mihaela van der Schaar
Stefan Feuerriegel
CML
394
6
0
19 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
OffRL
490
0
0
02 May 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
International Conference on Learning Representations (ICLR), 2025
Yuheng Zhang
Nan Jiang
OffRL
302
5
0
03 Mar 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
689
4
0
22 Feb 2025
Learning Counterfactual Outcomes Under Rank Preservation
Peng Wu
Haoxuan Li
Chunyuan Zheng
Yan Zeng
Jiawei Chen
Yang Liu
Ruocheng Guo
Jianchao Tan
352
4
0
10 Feb 2025
Semiparametric Double Reinforcement Learning with Applications to Long-Term Causal Inference
Lars van der Laan
David Hubbard
Allen Tran
Nathan Kallus
Aurélien F. Bibaut
OffRL
496
0
0
12 Jan 2025
A Graphical Approach to State Variable Selection in Off-policy Learning
Joakim Blach Andersen
Qingyuan Zhao
CML
OffRL
290
1
0
03 Jan 2025
Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Ojash Neopane
Aaditya Ramdas
Aarti Singh
CML
293
3
0
21 Nov 2024
Debiased Regression for Root-N-Consistent Conditional Mean Estimation
Masahiro Kato
465
0
0
18 Nov 2024
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL
OnRL
307
6
0
06 Nov 2024
Primal-Dual Spectral Representation for Off-policy Evaluation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Yang Hu
Tianyi Chen
Na Li
Kai Wang
Bo Dai
OffRL
320
4
0
23 Oct 2024
CSPI-MT: Calibrated Safe Policy Improvement with Multiple Testing for Threshold Policies
Knowledge Discovery and Data Mining (KDD), 2024
Brian M Cho
Ana-Roxana Pop
Kyra Gan
Sam Corbett-Davies
Israel Nir
Ariel Evnine
Nathan Kallus
OffRL
236
2
0
21 Aug 2024
Model-agnostic meta-learners for estimating heterogeneous treatment effects over time
Dennis Frauen
Konstantin Hess
Stefan Feuerriegel
504
16
0
07 Jul 2024
Structured Difference-of-Q via Orthogonal Learning
Defu Cao
Angela Zhou
468
0
0
12 Jun 2024
Combining Experimental and Historical Data for Policy Evaluation
Ting Li
Chengchun Shi
Qianglin Wen
Yang Sui
Yongli Qin
Chunbo Lai
Hongtu Zhu
OffRL
480
4
0
01 Jun 2024
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
OnRL
294
7
0
28 May 2024
Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes
Andrew Bennett
Nathan Kallus
Miruna Oprescu
Wen Sun
Kaiwen Wang
AAML
OffRL
307
4
0
29 Mar 2024
Spatially Randomized Designs Can Enhance Policy Evaluation
Ying Yang
Chengchun Shi
Fang Yao
Shouyang Wang
Hongtu Zhu
OffRL
322
2
0
18 Mar 2024
Triple/Debiased Lasso for Statistical Inference of Conditional Average Treatment Effects
Masahiro Kato
CML
339
1
0
05 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
318
7
0
22 Feb 2024
Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap
Mohammad Mehrabi
Stefan Wager
OffRL
358
18
0
13 Feb 2024
POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition
Yuta Saito
Jihan Yao
Thorsten Joachims
OffRL
335
14
0
09 Feb 2024
Evaluation of Active Feature Acquisition Methods for Static Feature Settings
Henrik von Kleist
Alireza Zamanian
I. Shpitser
Narges Ahmidi
OffRL
316
4
0
06 Dec 2023
Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits
Neural Information Processing Systems (NeurIPS), 2023
Muhammad Faaiz Taufiq
Arnaud Doucet
Rob Cornish
Jean-François Ton
OffRL
351
10
0
03 Dec 2023
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
Henrik von Kleist
Alireza Zamanian
I. Shpitser
Narges Ahmidi
OffRL
719
8
0
03 Dec 2023
SCOPE-RL: A Python Library for Offline Reinforcement Learning and Off-Policy Evaluation
Haruka Kiyohara
Ren Kishimoto
K. Kawakami
Ken Kobayashi
Kazuhide Nakata
Yuta Saito
OffRL
ELM
541
5
0
30 Nov 2023
Randomization Inference When N Equals One
Biometrika, 2023
Tengyuan Liang
Benjamin Recht
CML
254
10
0
25 Oct 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou
Yuhan Li
Ruoqing Zhu
Annie Qu
OffRL
327
7
0
23 Sep 2023
Off-policy Evaluation in Doubly Inhomogeneous Environments
Journal of the American Statistical Association (JASA), 2023
Zeyu Bian
C. Shi
Zhengling Qi
Lan Wang
OffRL
328
12
0
14 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
454
10
0
30 May 2023
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling
International Conference on Machine Learning (ICML), 2023
Yuta Saito
Qingyang Ren
Thorsten Joachims
CML
OffRL
340
33
0
14 May 2023
Correcting for Interference in Experiments: A Case Study at Douyin
ACM Conference on Recommender Systems (RecSys), 2023
Vivek F. Farias
Hao Li
Tianyi Peng
Xinyuyang Ren
B. Hassibi
A. Zheng
243
18
0
04 May 2023
Conformal Off-Policy Evaluation in Markov Decision Processes
IEEE Conference on Decision and Control (CDC), 2023
Daniele Foffano
Alessio Russo
Alexandre Proutiere
OffRL
420
9
0
05 Apr 2023
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Jonas Rothfuss
Bhavya Sukhija
Tobias Birchler
Parnian Kassraie
Andreas Krause
OffRL
274
14
0
02 Mar 2023
Asking for Help: Failure Prediction in Behavioral Cloning through Value Approximation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Cem Gokmen
Daniel Ho
Mohi Khansari
OffRL
213
13
0
08 Feb 2023
Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders
David Bruns-Smith
Angela Zhou
OffRL
694
14
0
01 Feb 2023
Asymptotic Inference for Multi-Stage Stationary Treatment Policy with Variable Selection
Daiqi Gao
Yufeng Liu
D. Zeng
OffRL
289
0
0
29 Jan 2023
Model-based Offline Reinforcement Learning with Local Misspecification
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kefan Dong
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
258
6
0
26 Jan 2023
Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency
Wenlong Mou
Peng Ding
Martin J. Wainwright
Peter L. Bartlett
OffRL
272
12
0
16 Jan 2023
Quantile Off-Policy Evaluation via Deep Conditional Generative Learning
Yang Xu
C. Shi
Shuang Luo
Lan Wang
R. Song
OffRL
296
6
0
29 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
299
110
0
13 Dec 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Neural Information Processing Systems (NeurIPS), 2022
Audrey Huang
Nan Jiang
OffRL
217
9
0
27 Oct 2022
A Unified Framework for Alternating Offline Model Training and Policy Learning
Neural Information Processing Systems (NeurIPS), 2022
Shentao Yang
Shujian Zhang
Yihao Feng
Mi Zhou
OffRL
321
17
0
12 Oct 2022
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin
Mengdi Wang
Yu Wang
OffRL
407
12
0
03 Oct 2022