Importance Resampling for Off-policy Prediction

Neural Information Processing Systems (NeurIPS), 2019
11 June 2019
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
    OffRL

Papers citing "Importance Resampling for Off-policy Prediction"

29 / 29 papers shown
PipelineRL: Faster On-policy Reinforcement Learning for Long Sequence Generation
Alexandre Piché
Ehsan Kamaloo
Rafael Pardinas
Xiaoyin Chen
Dzmitry Bahdanau
OffRL, LRM
164
3
0
23 Sep 2025
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2024
Shuguang Yu
Shuxing Fang
Ruixin Peng
Zhengling Qi
Fan Zhou
C. Shi
CML, OffRL
286
5
0
08 Dec 2024
Kernel Metric Learning for In-Sample Off-Policy Evaluation of Deterministic RL Policies
Haanvid Lee
Tri Wahyu Guntara
Jongmin Lee
Yung-Kyun Noh
Kee-Eung Kim
OffRL
198
2
0
29 May 2024
Saturn: Sample-efficient Generative Molecular Design using Memory Manipulation
Jeff Guo
Philippe Schwaller
Mamba
249
12
0
27 May 2024
Bayesian Off-Policy Evaluation and Learning for Large Action Spaces
Imad Aouali
Victor-Emmanuel Brunel
David Rohde
Anna Korba
OffRL
340
9
0
22 Feb 2024
Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale
The Web Conference (WWW), 2023
Wei Wen
Kuang-Hung Liu
Igor Fedorov
Xin Zhang
Hang Yin
...
Fangqiu Han
Jiyan Yang
Yuchen Hao
Liang Xiong
Wen-Yen Chen
251
2
0
14 Nov 2023
AlberDICE: Addressing Out-Of-Distribution Joint Actions in Offline Multi-Agent RL via Alternating Stationary Distribution Correction Estimation
Neural Information Processing Systems (NeurIPS), 2023
Daiki E. Matsunaga
Jongmin Lee
Jaeseok Yoon
Stefanos Leonardos
Pieter Abbeel
Kee-Eung Kim
OODD, OffRL
166
7
0
03 Nov 2023
$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control
Michael Giegrich
Roel Oomen
C. Reisinger
OffRL
208
2
0
07 Jun 2023
An Instrumental Variable Approach to Confounded Off-Policy Evaluation
International Conference on Machine Learning (ICML), 2022
Yang Xu
Jin Zhu
C. Shi
Shuang Luo
R. Song
OffRL
292
23
0
29 Dec 2022
Actor Prioritized Experience Replay
Journal of Artificial Intelligence Research (JAIR), 2022
Baturay Saglam
Furkan B. Mutlu
Dogan C. Cicek
Suleyman S. Kozat
199
44
0
01 Sep 2022
Conformal Off-policy Prediction
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yingying Zhang
C. Shi
Shuang Luo
OffRL
269
13
0
14 Jun 2022
Variance Reduction based Partial Trajectory Reuse to Accelerate Policy Gradient Optimization
Hua Zheng
Wei Xie
259
3
0
06 May 2022
SOPE: Spectrum of Off-Policy Estimators
C. J. Yuan
Yash Chandak
S. Giguere
Philip S. Thomas
S. Niekum
OffRL
225
5
0
06 Nov 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
444
3
0
17 Oct 2021
Variational Actor-Critic Algorithms
Yuhua Zhu
Lexing Ying
OffRL
139
0
0
03 Aug 2021
Scalable Safety-Critical Policy Evaluation with Accelerated Rare Event Sampling
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021
Mengdi Xu
Peide Huang
Fengpei Li
Jiacheng Zhu
Xuewei Qi
K. Oguchi
Zhiyuan Huang
Henry Lam
Ding Zhao
213
4
0
19 Jun 2021
Statistical Testing under Distributional Shifts
Nikolaj Thams
Sorawit Saengkyongam
Niklas Pfister
J. Peters
OOD
353
11
0
22 May 2021
Learning robust driving policies without online exploration
IEEE International Conference on Robotics and Automation (ICRA), 2021
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
Jun Luo
OffRL
172
3
0
15 Mar 2021
Revisiting Prioritized Experience Replay: A Value Perspective
Ang Li
Zongqing Lu
Chenglin Miao
155
11
0
05 Feb 2021
Offline Learning of Counterfactual Predictions for Real-World Robotic Reinforcement Learning
IEEE International Conference on Robotics and Automation (ICRA), 2020
Jun Jin
D. Graves
Cameron Haigh
Jun Luo
Martin Jägersand
SSL, OffRL
242
6
0
11 Nov 2020
Affordance as general value function: A computational model
Adaptive Behavior (AB), 2020
D. Graves
Johannes Günther
Jun Luo
AI4CE
305
6
0
27 Oct 2020
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients
Jing An
Lexing Ying
Yuhua Zhu
309
43
0
28 Sep 2020
Revisiting Fundamentals of Experience Replay
International Conference on Machine Learning (ICML), 2020
W. Fedus
Prajit Ramachandran
Rishabh Agarwal
Yoshua Bengio
Hugo Larochelle
Mark Rowland
Will Dabney
KELM, OffRL
249
278
0
13 Jul 2020
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Neural Information Processing Systems (NeurIPS), 2020
Scott Fujimoto
David Meger
Doina Precup
239
67
0
12 Jul 2020
Learning predictive representations in autonomous driving to improve deep reinforcement learning
D. Graves
Nhat M. Nguyen
Kimia Hassanzadeh
Jun Jin
SSL
174
14
0
26 Jun 2020
Off-Policy Deep Reinforcement Learning with Analogous Disentangled Exploration
Adaptive Agents and Multi-Agent Systems (AAMAS), 2020
Hoang Trung-Dung
Yitao Liang
Karen Ullrich
OffRL
153
4
0
25 Feb 2020
Adaptive Experience Selection for Policy Gradient
S. Mohamad
Giovanni Montana
151
0
0
17 Feb 2020
Merging Deterministic Policy Gradient Estimations with Varied Bias-Variance Tradeoff for Effective Deep Reinforcement Learning
Gang Chen
180
4
0
24 Nov 2019
Context-Dependent Upper-Confidence Bounds for Directed Exploration
Neural Information Processing Systems (NeurIPS), 2018
Raksha Kumaraswamy
M. Schlegel
Adam White
Martha White
OffRL
220
12
0
15 Nov 2018