More Robust Doubly Robust Off-policy Evaluation

10 February 2018
Mehrdad Farajtabar
Yinlam Chow
Mohammad Ghavamzadeh
    OffRL

Papers citing "More Robust Doubly Robust Off-policy Evaluation"

50 / 178 papers shown
A Case for Leveraging Generative AI to Expand and Enhance Training in the Provision of Mental Health Services
Hannah R. Lawrence
Shannon Wiltsey Stirman
Samuel Dorison
Taedong Yun
Megan Jones Bell
AI4MH
198
0
0
08 Oct 2025
An Orthogonal Learner for Individualized Outcomes in Markov Decision Processes
Emil Javurek
Valentyn Melnychuk
Jonas Schweisthal
Konstantin Hess
Dennis Frauen
Stefan Feuerriegel
187
2
0
30 Sep 2025
Context-Action Embedding Learning for Off-Policy Evaluation in Contextual Bandits
Kushagra Chandak
Vincent Liu
Haanvid Lee
OffRL
200
0
0
31 Aug 2025
Beyond Prediction: Reinforcement Learning as the Defining Leap in Healthcare AI
Dilruk Perera
Gousia Habib
Qianyi Xu
Daniel J. Tan
Kai He
Erik Cambria
Mengling Feng
OffRL, AI4TS
333
0
0
28 Aug 2025
Meta Off-Policy Estimation
ACM Conference on Recommender Systems (RecSys), 2025
Olivier Jeunen
OffRL
166
0
0
11 Aug 2025
PERRY: Policy Evaluation with Confidence Intervals using Auxiliary Data
Aishwarya Mandyam
Jason Meng
Ge Gao
Jiankai Sun
Mac Schwager
Barbara E. Engelhardt
Emma Brunskill
OffRL
235
2
0
26 Jul 2025
A General Framework for Off-Policy Learning with Partially-Observed Reward
International Conference on Learning Representations (ICLR), 2025
Rikiya Takehi
Masahiro Asami
K. Kawakami
Yuta Saito
OffRL
206
1
0
17 Jun 2025
Doubly Robust Alignment for Large Language Models
Erhan Xu
Kai Ye
Hongyi Zhou
Luhan Zhu
Francesco Quinzan
Chengchun Shi
358
10
0
01 Jun 2025
Demystifying the Paradox of Importance Sampling with an Estimated History-Dependent Behavior Policy in Off-Policy Evaluation
Hongyi Zhou
Josiah P. Hanna
Jin Zhu
Ying Yang
Chengchun Shi
OffRL
261
4
0
28 May 2025
STITCH-OPE: Trajectory Stitching with Guided Diffusion for Off-Policy Evaluation
Hossein Goli
Michael Gimelfarb
Nathan Samuel de Lara
Haruki Nishimura
Masha Itkina
Florian Shkurti
OffRL
392
2
0
27 May 2025
DOLCE: Decomposing Off-Policy Evaluation/Learning into Lagged and Current Effects
Shu Tamano
OffRL
497
0
0
02 May 2025
Counterfactual Inference under Thompson Sampling
ACM Conference on Recommender Systems (RecSys), 2025
Olivier Jeunen
OffRL, LRM
339
1
0
03 Apr 2025
Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs
International Conference on Learning Representations (ICLR), 2025
Yuheng Zhang
Nan Jiang
OffRL
302
5
0
03 Mar 2025
Clustering Context in Off-Policy Evaluation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Daniel Guzman-Olivares
Philipp Schmidt
Jacek Golebiowski
Artur Bekasov
CML, OffRL
217
2
0
28 Feb 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML, OffRL
380
7
0
08 Dec 2024
Concept-driven Off Policy Evaluation
Ritam Majumdar
Jack Teversham
Sonali Parbhoo
OffRL
345
0
0
28 Nov 2024
Off-policy estimation with adaptively collected data: the power of online learning
Neural Information Processing Systems (NeurIPS), 2024
Jeonghwan Lee
Cong Ma
OffRL
381
1
0
19 Nov 2024
Abstract Reward Processes: Leveraging State Abstraction for Consistent Off-Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2024
Shreyas Chaudhari
Ameet Deshpande
Bruno Castro da Silva
Philip S. Thomas
OffRL
266
1
0
03 Oct 2024
Designing an Interpretable Interface for Contextual Bandits
Andrew Maher
Matia Gobbo
Lancelot Lachartre
Subash Prabanantham
Rowan Swiers
Puli Liyanagama
257
0
0
23 Sep 2024
Effective Off-Policy Evaluation and Learning in Contextual Combinatorial Bandits
ACM Conference on Recommender Systems (RecSys), 2024
Tatsuhiro Shimizu
Koichi Tanaka
Ren Kishimoto
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
CML, OffRL
359
8
0
20 Aug 2024
Balancing Immediate Revenue and Future Off-Policy Evaluation in Coupon Allocation
Naoki Nishimura
Ken Kobayashi
Kazuhide Nakata
OffRL
198
0
0
06 Jul 2024
Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
283
4
0
05 Jun 2024
Combining Experimental and Historical Data for Policy Evaluation
Ting Li
Chengchun Shi
Qianglin Wen
Yang Sui
Yongli Qin
Chunbo Lai
Hongtu Zhu
OffRL
481
4
0
01 Jun 2024
OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators
Allen Nie
Yash Chandak
Christina J. Yuan
Anirudhan Badrinath
Yannis Flet-Berliac
Emma Brunskill
OffRL
291
4
0
27 May 2024
Cross-Validated Off-Policy Evaluation
Matej Cief
Branislav Kveton
Michal Kompan
OffRL
357
2
0
24 May 2024
Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning
Neural Information Processing Systems (NeurIPS), 2024
Otmane Sakhi
Imad Aouali
Pierre Alquier
Nicolas Chopin
OffRL
353
13
0
23 May 2024
Optimal Baseline Corrections for Off-Policy Contextual Bandits
ACM Conference on Recommender Systems (RecSys), 2024
Shashank Gupta
Olivier Jeunen
Harrie Oosterhuis
Maarten de Rijke
344
14
0
09 May 2024
Long-term Off-Policy Evaluation and Learning
Yuta Saito
Himan Abdollahpouri
Jesse Anderton
Ben Carterette
M. Lalmas
OffRL
295
13
0
24 Apr 2024
Hyperparameter Optimization Can Even be Harmful in Off-Policy Learning and How to Deal with It
Yuta Saito
Masahiro Nomura
OffRL
344
5
0
23 Apr 2024
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Toru Shirakawa
Yi Li
Yulun Wu
Sky Qiu
Yuxuan Li
Mingduo Zhao
Hiroyasu Iso
Mark van der Laan
346
18
0
05 Apr 2024
Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy
Kyungbok Lee
M. Paik
OffRL
134
0
0
02 Apr 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
413
1
0
29 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
318
7
0
22 Feb 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
391
10
0
22 Feb 2024
Offline Multi-task Transfer RL with Representational Penalization
Avinandan Bose
S. S. Du
Maryam Fazel
OffRL
369
13
0
19 Feb 2024
POTEC: Off-Policy Learning for Large Action Spaces via Two-Stage Policy Decomposition
Yuta Saito
Jihan Yao
Thorsten Joachims
OffRL
335
14
0
09 Feb 2024
Off-Policy Evaluation of Slate Bandit Policies via Optimizing Abstraction
Haruka Kiyohara
Masahiro Nomura
Yuta Saito
703
17
0
03 Feb 2024
Distributionally Robust Policy Evaluation under General Covariate Shift in Contextual Bandits
Yi Guo
Hao Liu
Yisong Yue
Anqi Liu
OffRL
297
3
0
21 Jan 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
233
0
0
24 Dec 2023
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
Longchao Da
Porter Jenkins
Trevor Schwantes
Jeffrey Dotson
Hua Wei
OffRL
246
3
0
17 Dec 2023
Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits
Neural Information Processing Systems (NeurIPS), 2023
Muhammad Faaiz Taufiq
Arnaud Doucet
Rob Cornish
Jean-François Ton
OffRL
351
10
0
03 Dec 2023
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Jin Zhu
Runzhe Wan
Zhengling Qi
Shuang Luo
C. Shi
OffRL
391
5
0
28 Oct 2023
State-Action Similarity-Based Representations for Off-Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Brahma S. Pavse
Josiah P. Hanna
OffRL
299
4
0
27 Oct 2023
Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Shengpu Tang
Jenna Wiens
OffRL, CML
301
6
0
26 Oct 2023
Off-Policy Evaluation for Large Action Spaces via Policy Convolution
The Web Conference (WWW), 2023
Noveen Sachdeva
Lequn Wang
Dawen Liang
Nathan Kallus
Julian McAuley
OffRL
336
17
0
24 Oct 2023
Off-Policy Evaluation for Human Feedback
Neural Information Processing Systems (NeurIPS), 2023
Qitong Gao
Ge Gao
Juncheng Dong
Vahid Tarokh
Min Chi
Miroslav Pajic
OffRL
390
10
0
11 Oct 2023
Ad-load Balancing via Off-policy Learning in a Content Marketplace
Web Search and Data Mining (WSDM), 2023
Hitesh Sagtani
M. Jhawar
Rishabh Mehrotra
Olivier Jeunen
OffRL
443
12
0
19 Sep 2023
Doubly Robust Estimator for Off-Policy Evaluation with Large Action Spaces
IEEE Symposium Series on Computational Intelligence (IEEE-SSCI), 2023
Tatsuhiro Shimizu
L. Forastiere
OffRL
260
1
0
07 Aug 2023
Leveraging Factored Action Spaces for Off-Policy Evaluation
Aaman Rebello
Shengpu Tang
Jenna Wiens
Sonali Parbhoo
CML, OffRL
171
2
0
13 Jul 2023
Value-aware Importance Weighting for Off-policy Reinforcement Learning
Kristopher De Asis
Eric Graves
R. Sutton
OffRL
272
3
0
27 Jun 2023