2106.00418
Post-Contextual-Bandit Inference
1 June 2021
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
Papers citing "Post-Contextual-Bandit Inference"
24 / 24 papers shown
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
177
2
0
22 Feb 2025
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL
38
61
0
03 Jun 2021
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
OffRL
34
13
0
03 Jun 2021
Statistical Inference with M-Estimators on Adaptively Collected Data
Kelly W. Zhang
Lucas Janson
Susan Murphy
OffRL
40
43
0
29 Apr 2021
Confidence Intervals for Policy Evaluation in Adaptive Experiments
Vitor Hadad
David A. Hirshberg
Ruohan Zhan
Stefan Wager
Susan Athey
33
143
0
07 Nov 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
47
91
0
12 Sep 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
55
54
0
09 Jun 2019
On the bias, risk and consistency of sample means in multi-armed bandits
Jaehyeok Shin
Aaditya Ramdas
Alessandro Rinaldo
36
35
0
02 Feb 2019
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
51
267
0
10 Feb 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
80
69
0
19 Nov 2017
OpenML Benchmarking Suites
B. Bischl
Giuseppe Casalicchio
Matthias Feurer
Pieter Gijsbers
Frank Hutter
Michel Lang
R. G. Mantovani
Jan N. van Rijn
Joaquin Vanschoren
VLM
ELM
65
156
0
11 Aug 2017
Why Adaptively Collected Data Have Negative Bias and How to Correct for It
Xinkun Nie
Xiaoying Tian
Jonathan E. Taylor
James Zou
OnRL
48
88
0
07 Aug 2017
Balanced Policy Evaluation and Learning
Nathan Kallus
CML
OffRL
222
141
0
21 May 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu Wang
Alekh Agarwal
Miroslav Dudík
OffRL
59
220
0
04 Dec 2016
Dynamic Assortment Personalization in High Dimensions
Nathan Kallus
Madeleine Udell
89
66
0
18 Oct 2016
Dynamic Pricing with Demand Covariates
Sheng Qiang
Mohsen Bayati
78
116
0
25 Apr 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016
Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
Alexander Luedtke
M. J. van der Laan
65
220
0
24 Mar 2016
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
122
285
0
10 Mar 2015
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
157
694
0
23 Mar 2011
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
152
574
0
31 Mar 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
277
2,935
0
28 Feb 2010
On the minimal penalty for Markov order estimation
R. Handel
79
34
0
25 Aug 2009
The Offset Tree for Learning with Partial Labels
A. Beygelzimer
John Langford
112
184
0
21 Dec 2008