2010.11002
Optimal Off-Policy Evaluation from Multiple Logging Policies
21 October 2020
Nathan Kallus
Yuta Saito
Masatoshi Uehara
OffRL
Papers citing
"Optimal Off-Policy Evaluation from Multiple Logging Policies"
30 / 30 papers shown
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
138
0
0
02 May 2025
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
117
75
0
17 Aug 2020
Near-Optimal Provable Uniform Convergence in Offline Policy Evaluation for Reinforcement Learning
Ming Yin
Yu Bai
Yu Wang
OffRL
75
31
0
07 Jul 2020
Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
29
9
0
06 Jun 2020
GenDICE: Generalized Offline Estimation of Stationary Values
Ruiyi Zhang
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
132
173
0
21 Feb 2020
Adaptive Estimator Selection for Off-Policy Evaluation
Yi-Hsun Su
Pavithra Srinath
A. Krishnamurthy
OffRL
23
45
0
18 Feb 2020
Statistically Efficient Off-Policy Policy Gradients
Nathan Kallus
Masatoshi Uehara
OffRL
47
37
0
10 Feb 2020
Inference for Batched Bandits
Kelly W. Zhang
Lucas Janson
Susan Murphy
64
82
0
08 Feb 2020
Confidence Intervals for Policy Evaluation in Adaptive Experiments
Vitor Hadad
David A. Hirshberg
Ruohan Zhan
Stefan Wager
Susan Athey
36
143
0
07 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
99
186
0
28 Oct 2019
Infinite-horizon Off-Policy Policy Evaluation with Multiple Behavior Policies
Xinyun Chen
Lu Wang
Yizhe Hang
Heng Ge
H. Zha
OffRL
69
5
0
10 Oct 2019
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes
Nathan Kallus
Masatoshi Uehara
OffRL
68
185
0
22 Aug 2019
Off-policy Learning for Multiple Loggers
Li He
Long Xia
Wei Zeng
Zhi-Ming Ma
Yue Zhao
Dawei Yin
OffRL
32
10
0
23 Jul 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
55
54
0
09 Jun 2019
Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation
Qiang Liu
Lihong Li
Ziyang Tang
Dengyong Zhou
OffRL
101
354
0
29 Oct 2018
Efficient Counterfactual Learning from Bandit Feedback
Yusuke Narita
Shota Yasui
Kohei Yata
OffRL
52
47
0
10 Sep 2018
Analysis of Noise Contrastive Estimation from the Perspective of Asymptotic Variance
Masatoshi Uehara
Takeru Matsuda
F. Komaki
28
13
0
24 Aug 2018
Counterfactual Mean Embeddings
Krikamol Muandet
Motonobu Kanagawa
Sorawit Saengkyongam
S. Marukatat
CML
OffRL
44
39
0
22 May 2018
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
54
267
0
10 Feb 2018
Balanced Policy Evaluation and Learning
Nathan Kallus
CML
OffRL
243
141
0
21 May 2017
Effective Evaluation using Logged Bandit Feedback from Multiple Loggers
Aman Agarwal
Soumya Basu
Tobias Schnabel
Thorsten Joachims
OffRL
95
68
0
17 Mar 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
63
220
0
04 Dec 2016
Recursive Partitioning for Personalization using Observational Data
Nathan Kallus
CML
67
99
0
31 Aug 2016
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
125
611
0
08 Jun 2016
Off-policy evaluation for slate recommendation
Adith Swaminathan
A. Krishnamurthy
Alekh Agarwal
Miroslav Dudík
John Langford
Damien Jose
I. Zitouni
CML
OffRL
43
227
0
16 May 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
246
573
0
04 Apr 2016
Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
Alexander Luedtke
M. J. van der Laan
84
220
0
24 Mar 2016
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
148
621
0
11 Nov 2015
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
133
285
0
10 Mar 2015
Learning from Logged Implicit Exploration Data
Alexander L. Strehl
John Langford
Sham Kakade
Lihong Li
OffRL
116
254
0
27 Feb 2010