Post-Contextual-Bandit Inference

1 June 2021
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
arXiv: 2106.00418

Papers citing "Post-Contextual-Bandit Inference"

24 / 24 papers shown
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
177
2
0
22 Feb 2025
Off-Policy Evaluation via Adaptive Weighting with Data from Contextual Bandits
Ruohan Zhan
Vitor Hadad
David A. Hirshberg
Susan Athey
OffRL
38
61
0
03 Jun 2021
Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning
Aurélien F. Bibaut
Antoine Chambaz
Maria Dimakopoulou
Nathan Kallus
Mark van der Laan
OffRL
34
13
0
03 Jun 2021
Statistical Inference with M-Estimators on Adaptively Collected Data
Kelly W. Zhang
Lucas Janson
Susan Murphy
OffRL
40
43
0
29 Apr 2021
Confidence Intervals for Policy Evaluation in Adaptive Experiments
Vitor Hadad
David A. Hirshberg
Ruohan Zhan
Stefan Wager
Susan Athey
33
143
0
07 Nov 2019
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
47
91
0
12 Sep 2019
Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning
Nathan Kallus
Masatoshi Uehara
OffRL
55
54
0
09 Jun 2019
On the bias, risk and consistency of sample means in multi-armed bandits
Jaehyeok Shin
Aaditya Ramdas
Alessandro Rinaldo
36
35
0
02 Feb 2019
More Robust Doubly Robust Off-policy Evaluation
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
OffRL
51
267
0
10 Feb 2018
Estimation Considerations in Contextual Bandits
Maria Dimakopoulou
Zhengyuan Zhou
Susan Athey
Guido Imbens
80
69
0
19 Nov 2017
OpenML Benchmarking Suites
B. Bischl
Giuseppe Casalicchio
Matthias Feurer
Pieter Gijsbers
Frank Hutter
Michel Lang
R. G. Mantovani
Jan N. van Rijn
Joaquin Vanschoren
VLM
ELM
65
156
0
11 Aug 2017
Why Adaptively Collected Data Have Negative Bias and How to Correct for It
Xinkun Nie
Xiaoying Tian
Jonathan E. Taylor
James Zou
OnRL
48
88
0
07 Aug 2017
Balanced Policy Evaluation and Learning
Nathan Kallus
CML
OffRL
222
141
0
21 May 2017
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Yu-Xiang Wang
Alekh Agarwal
Miroslav Dudík
OffRL
59
220
0
04 Dec 2016
Dynamic Assortment Personalization in High Dimensions
Nathan Kallus
Madeleine Udell
89
66
0
18 Oct 2016
Dynamic Pricing with Demand Covariates
Sheng Qiang
Mohsen Bayati
78
116
0
25 Apr 2016
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016
Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy
Alexander Luedtke
M. J. van der Laan
65
220
0
24 Mar 2016
Doubly Robust Policy Evaluation and Optimization
Miroslav Dudík
D. Erhan
John Langford
Lihong Li
OffRL
122
285
0
10 Mar 2015
Doubly Robust Policy Evaluation and Learning
Miroslav Dudík
John Langford
Lihong Li
OffRL
157
694
0
23 Mar 2011
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
152
574
0
31 Mar 2010
A Contextual-Bandit Approach to Personalized News Article Recommendation
Lihong Li
Wei Chu
John Langford
Robert Schapire
277
2,935
0
28 Feb 2010
On the minimal penalty for Markov order estimation
Ramon van Handel
79
34
0
25 Aug 2009
The Offset Tree for Learning with Partial Labels
A. Beygelzimer
John Langford
112
184
0
21 Dec 2008