2002.11642
Off-Policy Evaluation and Learning for External Validity under a Covariate Shift
26 February 2020
Masahiro Kato
Masatoshi Uehara
Shota Yasui
OffRL
Papers citing
"Off-Policy Evaluation and Learning for External Validity under a Covariate Shift"
12 / 12 papers shown
Title
Doubly Robust Alignment for Large Language Models
Erhan Xu
Kai Ye
Hongyi Zhou
Luhan Zhu
Francesco Quinzan
Chengchun Shi
24
0
0
01 Jun 2025
Counterfactual Learning with Multioutput Deep Kernels
A. Caron
G. Baio
I. Manolopoulou
BDL
CML
OffRL
72
1
0
20 Nov 2022
Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation
Diego Martinez-Taboada
Dino Sejdinovic
CML
OffRL
43
0
0
02 Nov 2022
Unified Perspective on Probability Divergence via Maximum Likelihood Density Ratio Estimation: Bridging KL-Divergence and Integral Probability Metrics
Masahiro Kato
Masaaki Imaizumi
Kentaro Minami
64
0
0
31 Jan 2022
Evaluating the Robustness of Off-Policy Evaluation
Yuta Saito
Takuma Udagawa
Haruka Kiyohara
Kazuki Mogi
Yusuke Narita
Kei Tateno
ELM
OffRL
33
38
0
31 Aug 2021
Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support
Hung The Tran
Sunil R. Gupta
Thanh Nguyen-Tang
Santu Rana
Svetha Venkatesh
OffRL
52
5
0
24 Jul 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
97
19
0
13 May 2021
Reliable Off-policy Evaluation for Reinforcement Learning
Jie Wang
Rui Gao
H. Zha
OffRL
79
11
0
08 Nov 2020
A Practical Guide of Off-Policy Evaluation for Bandit Problems
Masahiro Kato
Kenshi Abe
Kaito Ariu
Shota Yasui
OffRL
52
3
0
23 Oct 2020
Open Bandit Dataset and Pipeline: Towards Realistic and Reproducible Off-Policy Evaluation
Yuta Saito
Shunsuke Aihara
Megumi Matsutani
Yusuke Narita
OffRL
201
75
0
17 Aug 2020
Non-Negative Bregman Divergence Minimization for Deep Direct Density Ratio Estimation
Masahiro Kato
Takeshi Teshima
84
36
0
12 Jun 2020
Counterfactual Mean Embeddings
Krikamol Muandet
Motonobu Kanagawa
Sorawit Saengkyongam
S. Marukatat
CML
OffRL
101
40
0
22 May 2018