ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.04733
  4. Cited By
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary
  Distribution Corrections

DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

10 June 2019
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
    OffRL
ArXivPDFHTML

Papers citing "DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections"

50 / 103 papers shown
Title
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
50
4
0
29 Nov 2021
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning
  Leveraging Planning
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning
Christopher P. Diehl
Timo Sievernich
Martin Krüger
F. Hoffmann
Torsten Bertram
OffRL
53
26
0
22 Nov 2021
d3rlpy: An Offline Deep Reinforcement Learning Library
d3rlpy: An Offline Deep Reinforcement Learning Library
Takuma Seno
M. Imai
OffRL
GP
65
101
0
06 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
43
10
0
04 Nov 2021
Curriculum Offline Imitation Learning
Curriculum Offline Imitation Learning
Minghuan Liu
Hanye Zhao
Zhengyu Yang
Jian Shen
Weinan Zhang
Li Zhao
Tie-Yan Liu
OffRL
29
1
0
03 Nov 2021
False Correlation Reduction for Offline Reinforcement Learning
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
44
9
0
24 Oct 2021
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
Tianjun Zhang
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
Joseph E. Gonzalez
OffRL
48
17
0
22 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
46
2
0
17 Oct 2021
Offline Reinforcement Learning with Soft Behavior Regularization
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
31
31
0
14 Oct 2021
The $f$-Divergence Reinforcement Learning Framework
The fff-Divergence Reinforcement Learning Framework
Chen Gong
Qiang He
Yunpeng Bai
Zhouyi Yang
Xiaoyu Chen
Xinwen Hou
Xianjie Zhang
Yu Liu
Guoliang Fan
42
3
0
24 Sep 2021
A Workflow for Offline Model-Free Robotic Reinforcement Learning
A Workflow for Offline Model-Free Robotic Reinforcement Learning
Aviral Kumar
Anika Singh
Stephen Tian
Chelsea Finn
Sergey Levine
OffRL
143
85
0
22 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement
  Learning
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
34
115
0
19 Aug 2021
Supervised Off-Policy Ranking
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
37
5
0
03 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
42
38
0
22 Jun 2021
OptiDICE: Offline Policy Optimization via Stationary Distribution
  Correction Estimation
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
37
91
0
21 Jun 2021
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution
  Matching
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching
Min Sun
Anuj Mahajan
Katja Hofmann
Shimon Whiteson
OffRL
26
12
0
06 Jun 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with
  Density-Ratio Correction
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
28
5
0
02 Jun 2021
On Instrumental Variable Regression for Deep Offline Policy Evaluation
On Instrumental Variable Regression for Deep Offline Policy Evaluation
Yutian Chen
Liyuan Xu
Çağlar Gülçehre
T. Paine
Arthur Gretton
Nando de Freitas
Arnaud Doucet
OffRL
56
18
0
21 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in
  Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
42
19
0
13 May 2021
Deeply-Debiased Off-Policy Interval Estimation
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
30
36
0
10 May 2021
Autoregressive Dynamics Models for Offline Policy Evaluation and
  Optimization
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Michael Ruogu Zhang
T. Paine
Ofir Nachum
Cosmin Paduraru
George Tucker
Ziyun Wang
Mohammad Norouzi
OffRL
35
45
0
28 Apr 2021
Benchmarks for Deep Off-Policy Evaluation
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
40
100
0
30 Mar 2021
Nearly Horizon-Free Offline Reinforcement Learning
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
37
49
0
25 Mar 2021
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement
  Learning
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
Samarth Sinha
Ajay Mandlekar
Animesh Garg
OffRL
31
107
0
10 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
30
42
0
08 Mar 2021
Off-Policy Imitation Learning from Observations
Off-Policy Imitation Learning from Observations
Zhuangdi Zhu
Kaixiang Lin
Bo Dai
Jiayu Zhou
OffRL
34
86
0
25 Feb 2021
Instrumental Variable Value Iteration for Causal Offline Reinforcement
  Learning
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
Luofeng Liao
Zuyue Fu
Zhuoran Yang
Yixin Wang
Mladen Kolar
Zhaoran Wang
OffRL
46
35
0
19 Feb 2021
Is Pessimism Provably Efficient for Offline RL?
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
32
350
0
30 Dec 2020
POPO: Pessimistic Offline Policy Optimization
POPO: Pessimistic Offline Policy Optimization
Qiang He
Xinwen Hou
OffRL
42
10
0
26 Dec 2020
Revisiting Rainbow: Promoting more Insightful and Inclusive Deep
  Reinforcement Learning Research
Revisiting Rainbow: Promoting more Insightful and Inclusive Deep Reinforcement Learning Research
J. Obando-Ceron
Pablo Samuel Castro
OffRL
20
105
0
20 Nov 2020
Reliable Off-policy Evaluation for Reinforcement Learning
Reliable Off-policy Evaluation for Reinforcement Learning
Jie Wang
Rui Gao
H. Zha
OffRL
27
11
0
08 Nov 2020
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample
  Efficient
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
Botao Hao
Yaqi Duan
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
18
27
0
08 Nov 2020
Harnessing Distribution Ratio Estimators for Learning Agents with
  Quality and Diversity
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Tanmay Gangwani
Jian Peng
Yuanshuo Zhou
29
10
0
05 Nov 2020
CoinDICE: Off-Policy Confidence Interval Estimation
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
31
84
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
26
42
0
02 Aug 2020
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with
  Latent Confounders
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
Andrew Bennett
Nathan Kallus
Lihong Li
Ali Mousavi
OffRL
40
43
0
27 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
41
81
0
23 Jul 2020
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline
  and Online RL
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
217
119
0
21 Jul 2020
Hyperparameter Selection for Offline Reinforcement Learning
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
56
146
0
17 Jul 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
46
592
0
16 Jun 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
33
38
0
22 Apr 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement
  Learning
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
36
32
0
24 Mar 2020
Stable Policy Optimization via Off-Policy Divergence Regularization
Stable Policy Optimization via Off-Policy Divergence Regularization
Ahmed Touati
Amy Zhang
Joelle Pineau
Pascal Vincent
OffRL
36
17
0
09 Mar 2020
Batch Stationary Distribution Estimation
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
29
22
0
02 Mar 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
46
17
0
06 Feb 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
21
118
0
07 Jan 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement
  Learning
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
35
152
0
15 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
33
184
0
28 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
Previous
123
Next