Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1511.03722
Cited By
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
11 November 2015
Nan Jiang
Lihong Li
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Doubly Robust Off-policy Value Evaluation for Reinforcement Learning"
11 / 11 papers shown
Title
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
Masanori Nojima
OffRL
131
0
0
02 May 2025
Statistical Inference in Reinforcement Learning: A Selective Survey
Chengchun Shi
OffRL
175
2
0
22 Feb 2025
The Best Instruction-Tuning Data are Those That Fit
Dylan Zhang
Qirun Dai
Hao Peng
ALM
150
6
0
06 Feb 2025
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen
Shuze Liu
Shangtong Zhang
OffRL
298
1
0
08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
124
3
0
03 Oct 2024
Uncertainty Calibration for Counterfactual Propensity Estimation in Recommendation
Wenbo Hu
Xin Sun
Qiang liu
Wenbo Hu
Shu Wu
64
0
0
23 Mar 2023
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
108
75
0
17 Aug 2020
Off-policy Bandits with Deficient Support
Noveen Sachdeva
Yi-Hsun Su
Thorsten Joachims
OffRL
114
75
0
16 Jun 2020
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
59
269
0
14 Mar 2015
Unbiased Offline Evaluation of Contextual-bandit-based News Article Recommendation Algorithms
Lihong Li
Wei Chu
John Langford
Xuanhui Wang
OffRL
150
574
0
31 Mar 2010
1