arXiv:2106.02029
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
3 June 2021
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL
Papers citing "Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits"
18 / 18 papers shown
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu
Jiyuan Tu
Yichen Zhang
Xi Chen
OffRL
45
4
0
04 Oct 2023
Tractable contextual bandits beyond realizability
Sanath Kumar Krishnamurthy
Vitor Hadad
Susan Athey
22
8
0
25 Oct 2020
Optimal Off-Policy Evaluation from Multiple Logging Policies
Nathan Kallus
Yuta Saito
Masatoshi Uehara
OffRL
33
40
0
21 Oct 2020
On conditional versus marginal bias in multi-armed bandits
Jaehyeok Shin
Aaditya Ramdas
Alessandro Rinaldo
22
12
0
19 Feb 2020
Inference for Batched Bandits
Kelly W. Zhang
Lucas Janson
Susan Murphy
64
82
0
08 Feb 2020
Confidence Intervals for Policy Evaluation in Adaptive Experiments
Vitor Hadad
David A. Hirshberg
Ruohan Zhan
Stefan Wager
Susan Athey
36
143
0
07 Nov 2019
Doubly robust off-policy evaluation with shrinkage
Yi-Hsun Su
Maria Dimakopoulou
A. Krishnamurthy
Miroslav Dudík
OffRL
38
104
0
22 Jul 2019
Are sample means in multi-armed bandits positively or negatively biased?
Jaehyeok Shin
Aaditya Ramdas
Alessandro Rinaldo
33
36
0
27 May 2019
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
80
69
0
19 Nov 2017
Why Adaptively Collected Data Have Negative Bias and How to Correct for It
Xinkun Nie
Xiaoying Tian
Jonathan E. Taylor
James Zou
OnRL
48
88
0
07 Aug 2017
Effective Evaluation using Logged Bandit Feedback from Multiple Loggers
Aman Agarwal
Soumya Basu
Tobias Schnabel
Thorsten Joachims
OffRL
90
68
0
17 Mar 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
59
220
0
04 Dec 2016
Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
Alexander Luedtke
M. J. van der Laan
65
220
0
24 Mar 2016
OpenML: networked science in machine learning
Joaquin Vanschoren
Jan N. van Rijn
B. Bischl
Luís Torgo
FedML
AI4CE
103
1,310
0
29 Jul 2014
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal
Navin Goyal
136
993
0
15 Sep 2012
Counterfactual Reasoning and Learning Systems
Léon Bottou
J. Peters
J. Q. Candela
Denis Xavier Charles
D. M. Chickering
Elon Portugaly
Dipankar Ray
Patrice Y. Simard
Edward Snelson
CML
OffRL
180
781
0
11 Sep 2012
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
157
694
0
23 Mar 2011
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
152
574
0
31 Mar 2010