Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.04733
Cited By
DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections
10 June 2019
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections"
50 / 103 papers shown
Title
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
50
4
0
29 Nov 2021
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning
Christopher P. Diehl
Timo Sievernich
Martin Krüger
F. Hoffmann
Torsten Bertram
OffRL
53
26
0
22 Nov 2021
d3rlpy: An Offline Deep Reinforcement Learning Library
Takuma Seno
M. Imai
OffRL
GP
65
101
0
06 Nov 2021
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch
Shangtong Zhang
Rémi Tachet des Combes
Romain Laroche
43
10
0
04 Nov 2021
Curriculum Offline Imitation Learning
Minghuan Liu
Hanye Zhao
Zhengyu Yang
Jian Shen
Weinan Zhang
Li Zhao
Tie-Yan Liu
OffRL
29
1
0
03 Nov 2021
False Correlation Reduction for Offline Reinforcement Learning
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
44
9
0
24 Oct 2021
C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
Tianjun Zhang
Benjamin Eysenbach
Ruslan Salakhutdinov
Sergey Levine
Joseph E. Gonzalez
OffRL
48
17
0
22 Oct 2021
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
46
2
0
17 Oct 2021
Offline Reinforcement Learning with Soft Behavior Regularization
Haoran Xu
Xianyuan Zhan
Jianxiong Li
Honglei Yin
OffRL
31
31
0
14 Oct 2021
The
f
f
f
-Divergence Reinforcement Learning Framework
Chen Gong
Qiang He
Yunpeng Bai
Zhouyi Yang
Xiaoyu Chen
Xinwen Hou
Xianjie Zhang
Yu Liu
Guoliang Fan
42
3
0
24 Sep 2021
A Workflow for Offline Model-Free Robotic Reinforcement Learning
Aviral Kumar
Anika Singh
Stephen Tian
Chelsea Finn
Sergey Levine
OffRL
143
85
0
22 Sep 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
34
115
0
19 Aug 2021
Supervised Off-Policy Ranking
Yue Jin
Yue Zhang
Tao Qin
Xudong Zhang
Jian Yuan
Houqiang Li
Tie-Yan Liu
OffRL
37
5
0
03 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
42
38
0
22 Jun 2021
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee
Wonseok Jeon
Byung-Jun Lee
J. Pineau
Kee-Eung Kim
OffRL
37
91
0
21 Jun 2021
SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching
Min Sun
Anuj Mahajan
Katja Hofmann
Shimon Whiteson
OffRL
26
12
0
06 Jun 2021
On the Convergence Rate of Off-Policy Policy Optimization Methods with Density-Ratio Correction
Jiawei Huang
Nan Jiang
28
5
0
02 Jun 2021
On Instrumental Variable Regression for Deep Offline Policy Evaluation
Yutian Chen
Liyuan Xu
Çağlar Gülçehre
T. Paine
Arthur Gretton
Nando de Freitas
Arnaud Doucet
OffRL
56
18
0
21 May 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu Wang
OffRL
42
19
0
13 May 2021
Deeply-Debiased Off-Policy Interval Estimation
C. Shi
Runzhe Wan
Victor Chernozhukov
R. Song
OffRL
30
36
0
10 May 2021
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Michael Ruogu Zhang
T. Paine
Ofir Nachum
Cosmin Paduraru
George Tucker
Ziyun Wang
Mohammad Norouzi
OffRL
35
45
0
28 Apr 2021
Benchmarks for Deep Off-Policy Evaluation
Justin Fu
Mohammad Norouzi
Ofir Nachum
George Tucker
Ziyun Wang
...
Yutian Chen
Aviral Kumar
Cosmin Paduraru
Sergey Levine
T. Paine
ELM
OffRL
40
100
0
30 Mar 2021
Nearly Horizon-Free Offline Reinforcement Learning
Tongzheng Ren
Jialian Li
Bo Dai
S. Du
Sujay Sanghavi
OffRL
37
49
0
25 Mar 2021
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
Samarth Sinha
Ajay Mandlekar
Animesh Garg
OffRL
31
107
0
10 Mar 2021
Instabilities of Offline RL with Pre-Trained Neural Representation
Ruosong Wang
Yifan Wu
Ruslan Salakhutdinov
Sham Kakade
OffRL
30
42
0
08 Mar 2021
Off-Policy Imitation Learning from Observations
Zhuangdi Zhu
Kaixiang Lin
Bo Dai
Jiayu Zhou
OffRL
34
86
0
25 Feb 2021
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning
Luofeng Liao
Zuyue Fu
Zhuoran Yang
Yixin Wang
Mladen Kolar
Zhaoran Wang
OffRL
46
35
0
19 Feb 2021
Is Pessimism Provably Efficient for Offline RL?
Ying Jin
Zhuoran Yang
Zhaoran Wang
OffRL
32
350
0
30 Dec 2020
POPO: Pessimistic Offline Policy Optimization
Qiang He
Xinwen Hou
OffRL
42
10
0
26 Dec 2020
Revisiting Rainbow: Promoting more Insightful and Inclusive Deep Reinforcement Learning Research
J. Obando-Ceron
Pablo Samuel Castro
OffRL
20
105
0
20 Nov 2020
Reliable Off-policy Evaluation for Reinforcement Learning
Jie Wang
Rui Gao
H. Zha
OffRL
27
11
0
08 Nov 2020
Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient
Botao Hao
Yaqi Duan
Tor Lattimore
Csaba Szepesvári
Mengdi Wang
OffRL
18
27
0
08 Nov 2020
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Tanmay Gangwani
Jian Peng
Yuanshuo Zhou
29
10
0
05 Nov 2020
CoinDICE: Off-Policy Confidence Interval Estimation
Bo Dai
Ofir Nachum
Yinlam Chow
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
31
84
0
22 Oct 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
26
42
0
02 Aug 2020
Off-policy Evaluation in Infinite-Horizon Reinforcement Learning with Latent Confounders
Andrew Bennett
Nathan Kallus
Lihong Li
Ali Mousavi
OffRL
40
43
0
27 Jul 2020
Batch Policy Learning in Average Reward Markov Decision Processes
Peng Liao
Zhengling Qi
Runzhe Wan
P. Klasnja
Susan Murphy
OffRL
41
81
0
23 Jul 2020
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour
Dale Schuurmans
S. Gu
OffRL
217
119
0
21 Jul 2020
Hyperparameter Selection for Offline Reinforcement Learning
T. Paine
Cosmin Paduraru
Andrea Michi
Çağlar Gülçehre
Konrad Zolna
Alexander Novikov
Ziyun Wang
Nando de Freitas
GP
OffRL
56
146
0
17 Jul 2020
AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
Ashvin Nair
Abhishek Gupta
Murtaza Dalal
Sergey Levine
OffRL
OnRL
46
592
0
16 Jun 2020
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang
Bo Liu
Shimon Whiteson
33
38
0
22 Apr 2020
Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning
Ali Mousavi
Lihong Li
Qiang Liu
Denny Zhou
OffRL
36
32
0
24 Mar 2020
Stable Policy Optimization via Off-Policy Divergence Regularization
Ahmed Touati
Amy Zhang
Joelle Pineau
Pascal Vincent
OffRL
36
17
0
09 Mar 2020
Batch Stationary Distribution Estimation
Junfeng Wen
Bo Dai
Lihong Li
Dale Schuurmans
OffRL
29
22
0
02 Mar 2020
Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation
Yaqi Duan
Mengdi Wang
OffRL
32
149
0
21 Feb 2020
Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
Nan Jiang
Jiawei Huang
OffRL
46
17
0
06 Feb 2020
Reinforcement Learning via Fenchel-Rockafellar Duality
Ofir Nachum
Bo Dai
OffRL
21
118
0
07 Jan 2020
Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning
Cameron Voloshin
Hoang Minh Le
Nan Jiang
Yisong Yue
OffRL
35
152
0
15 Nov 2019
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Masatoshi Uehara
Jiawei Huang
Nan Jiang
OffRL
33
184
0
28 Oct 2019
Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation
Ziyang Tang
Yihao Feng
Lihong Li
Dengyong Zhou
Qiang Liu
OffRL
30
67
0
16 Oct 2019
Previous
1
2
3
Next