1906.03393
Cited By
v4 (latest)
Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling
Neural Information Processing Systems (NeurIPS), 2019
8 June 2019
Tengyang Xie
Yifei Ma
Yu Wang
OffRL
Papers citing
"Towards Optimal Off-Policy Evaluation for Reinforcement Learning with Marginalized Importance Sampling"
50 / 130 papers shown
BAMAS: Structuring Budget-Aware Multi-Agent Systems
Liming Yang
Junyu Luo
Xuanzhe Liu
Yiling Lou
Zhenpeng Chen
LLMAG
413
0
0
26 Nov 2025
Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
Nan Jiang
Tengyang Xie
OffRL
243
16
0
05 Oct 2025
Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts
M. Heuillet
Yufei Cui
Boxing Chen
Audrey Durand
Prasanna Parthasarathi
OffRL
ReLM
LRM
428
0
1
13 Aug 2025
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou
Josiah P. Hanna
Jin Zhu
Ying Yang
Chengchun Shi
OffRL
276
4
0
28 May 2025
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
Hossein Goli
Michael Gimelfarb
Nathan Samuel de Lara
Haruki Nishimura
Masha Itkina
Florian Shkurti
OffRL
395
2
0
27 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
OffRL
513
0
0
02 May 2025
Reinforcement Learning with Continuous Actions Under Unmeasured Confounding
Yuhan Li
Eugene Han
Yifan Hu
Wenzhuo Zhou
Zhengling Qi
Yifan Cui
Ruoqing Zhu
OffRL
1.0K
2
0
01 May 2025
When Machine Learning Meets Importance Sampling: A More Efficient Rare Event Estimation Approach
Ruoning Zhao
Xinyun Chen
163
0
0
18 Apr 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
International Conference on Learning Representations (ICLR), 2025
Yuheng Zhang
Nan Jiang
OffRL
305
5
0
03 Mar 2025
Reweighting Improves Conditional Risk Bounds
Yikai Zhang
Jiahe Lin
Fengpei Li
Songzhu Zheng
Anant Raj
Anderson Schneider
Yuriy Nevmyvaka
219
1
0
04 Jan 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML
OffRL
387
8
0
08 Dec 2024
Concept-driven Off Policy Evaluation
Ritam Majumdar
Jack Teversham
Sonali Parbhoo
OffRL
355
0
0
28 Nov 2024
Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
Xinyan Guan
Yanjiang Liu
Xinyu Lu
Boxi Cao
Xianpei Han
...
Le Sun
Jie Lou
Bowen Yu
Yaojie Lu
Hongyu Lin
ALM
665
10
0
18 Nov 2024
Scalable Offline Reinforcement Learning for Mean Field Games
Axel Brunnbauer
Julian Lemmel
Z. Babaiee
Sophie A. Neubauer
Radu Grosu
OffRL
283
0
0
23 Oct 2024
Primal-Dual Spectral Representation for Off-policy Evaluation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Yang Hu
Tianyi Chen
Na Li
Kai Wang
Bo Dai
OffRL
339
4
0
23 Oct 2024
The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation
Noah Golowich
Ankur Moitra
OffRL
365
3
0
17 Jun 2024
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
International Conference on Machine Learning (ICML), 2024
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
214
0
0
14 Jun 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
Jeongyeol Kwon
Shie Mannor
Constantine Caramanis
Yonathan Efroni
OffRL
451
6
0
03 Jun 2024
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
Haanvid Lee
Tri Wahyu Guntara
Jongmin Lee
Yung-Kyun Noh
Kee-Eung Kim
OffRL
295
3
0
29 May 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
435
1
0
29 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
328
7
0
22 Feb 2024
Offline Multi-task Transfer RL with Representational Penalization
Avinandan Bose
S. S. Du
Maryam Fazel
OffRL
383
13
0
19 Feb 2024
Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap
Mohammad Mehrabi
Stefan Wager
OffRL
364
18
0
13 Feb 2024
POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition
Yuta Saito
Jihan Yao
Thorsten Joachims
OffRL
345
14
0
09 Feb 2024
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
Longchao Da
Porter Jenkins
Trevor Schwantes
Jeffrey Dotson
Hua Wei
OffRL
246
3
0
17 Dec 2023
Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits
Neural Information Processing Systems (NeurIPS), 2023
Muhammad Faaiz Taufiq
Arnaud Doucet
Rob Cornish
Jean-François Ton
OffRL
356
10
0
03 Dec 2023
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Jin Zhu
Runzhe Wan
Zhengling Qi
Shuang Luo
C. Shi
OffRL
399
5
0
28 Oct 2023
Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Shengpu Tang
Jenna Wiens
OffRL
CML
304
6
0
26 Oct 2023
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks
Zihao Li
Xiang Ji
Minshuo Chen
Mengdi Wang
OffRL
313
0
0
16 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Neural Information Processing Systems (NeurIPS), 2023
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
380
7
0
09 Oct 2023
Stackelberg Batch Policy Learning
Wenzhuo Zhou
Annie Qu
OffRL
340
1
0
28 Sep 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou
Yuhan Li
Ruoqing Zhu
Annie Qu
OffRL
328
7
0
23 Sep 2023
A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using $L$-$\lambda$ Smoothness
Hengshuai Yao
348
3
0
29 Jul 2023
Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data
Sunil Madhow
Dan Xiao
Ming Yin
Yu-Xiang Wang
OffRL
320
0
0
24 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
459
10
0
30 May 2023
Off-Policy Evaluation for Large Action Spaces via Conjunct Effect Modeling
International Conference on Machine Learning (ICML), 2023
Yuta Saito
Qingyang Ren
Thorsten Joachims
CML
OffRL
358
33
0
14 May 2023
On the Sample Complexity of Vanilla Model-Based Offline Reinforcement Learning with Dependent Samples
AAAI Conference on Artificial Intelligence (AAAI), 2023
Mustafa O. Karabag
Ufuk Topcu
OffRL
339
6
0
07 Mar 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Adaptive Agents and Multi-Agent Systems (AAMAS), 2023
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
255
8
0
18 Feb 2023
Revisiting Bellman Errors for Offline Model Selection
International Conference on Machine Learning (ICML), 2023
Joshua P. Zitovsky
Daniel de Marchi
Rishabh Agarwal
Michael R. Kosorok (University of North Carolina at Chapel Hill)
OffRL
345
6
0
31 Jan 2023
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
International Conference on Machine Learning (ICML), 2023
Shuze Liu
Shangtong Zhang
OffRL
515
7
0
31 Jan 2023
A Reinforcement Learning Framework for Dynamic Mediation Analysis
International Conference on Machine Learning (ICML), 2023
Linjuan Ge
Jitao Wang
C. Shi
Zhanghua Wu
Rui Song
400
6
0
31 Jan 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
566
20
0
30 Jan 2023
Model-based Offline Reinforcement Learning with Local Misspecification
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kefan Dong
Yannis Flet-Berliac
Allen Nie
Emma Brunskill
OffRL
265
6
0
26 Jan 2023
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
Neural Information Processing Systems (NeurIPS), 2023
Yash Chandak
Shiv Shankar
Nathaniel D. Bastian
Bruno Castro da Silva
Emma Brunskill
Philip S. Thomas
OffRL
272
6
0
24 Jan 2023
Minimax Weight Learning for Absorbing MDPs
Statistical Papers (Stat. Pap.), 2023
Fengyin Li
Yuqiang Li
Xianyi Wu
OffRL
171
1
0
09 Jan 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
International Conference on Machine Learning (ICML), 2022
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
365
24
0
29 Dec 2022
Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
592
31
0
19 Dec 2022
Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction
AAAI Conference on Artificial Intelligence (AAAI), 2022
Brahma S. Pavse
Josiah P. Hanna
OffRL
228
9
0
14 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
303
114
0
13 Dec 2022
Low Variance Off-policy Evaluation with State-based Importance Sampling
Conference on Algebraic Informatics (AI), 2022
David M. Bossens
Philip S. Thomas
OffRL
538
5
0
07 Dec 2022