DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections

10 June 2019
Ofir Nachum
Yinlam Chow
Bo Dai
Lihong Li
    OffRL

Papers citing "DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections"

50 / 101 papers shown
Imagination-Limited Q-Learning for Offline Reinforcement Learning
Wenhui Liu
Zhijian Wu
Jingchao Wang
Dingjiang Huang
Shuigeng Zhou
OffRL
30
0
0
18 May 2025
Policy Regularization on Globally Accessible States in Cross-Dynamics Reinforcement Learning
Zhenghai Xue
Lang Feng
Jiacheng Xu
Kang Kang
Xiang Wen
Jingyi Wang
Shuicheng Yan
OffRL
58
0
0
10 Mar 2025
Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol
Pai Liu
Lingfeng Zhao
Shivangi Agarwal
Jinghan Liu
Audrey Huang
Philip Amortila
Nan Jiang
OODD
OffRL
109
0
0
11 Feb 2025
Dual Alignment Maximin Optimization for Offline Model-based RL
Chi Zhou
Wang Luo
Haoran Li
Congying Han
Tiande Guo
Zicheng Zhang
OffRL
78
0
0
02 Feb 2025
Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization
Zishun Yu
Tengyu Xu
Di Jin
Karthik Abinav Sankararaman
Yun He
...
Eryk Helenowski
Chen Zhu
Sinong Wang
Hao Ma
Han Fang
LRM
59
5
0
29 Jan 2025
SR-Reward: Taking The Path More Traveled
Seyed Mahdi Basiri Azad
Zahra Padar
Gabriel Kalweit
Joschka Boedecker
OffRL
77
0
0
04 Jan 2025
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen
Shuze Liu
Shangtong Zhang
OffRL
207
1
0
08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning
Shuze Liu
Claire Chen
Shangtong Zhang
OffRL
48
2
0
03 Oct 2024
A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints
Bram De Cooman
Johan A. K. Suykens
43
0
0
25 Apr 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
32
0
0
29 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
41
4
0
22 Feb 2024
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
Huy Hoang
Tien Mai
Pradeep Varakantham
OffRL
52
2
0
20 Feb 2024
Expert Proximity as Surrogate Rewards for Single Demonstration Imitation Learning
Chia-Cheng Chiang
Li-Cheng Lan
Wei-Fang Sun
Chien Feng
Cho-Jui Hsieh
Chun-Yi Lee
46
0
0
01 Feb 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
31
0
0
24 Dec 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
42
5
0
09 Oct 2023
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets
Zhang-Wei Hong
Aviral Kumar
Sathwik Karnik
Abhishek Bhandwaldar
Akash Srivastava
Joni Pajarinen
Romain Laroche
Abhishek Gupta
Pulkit Agrawal
OffRL
43
19
0
06 Oct 2023
Stackelberg Batch Policy Learning
Wenzhuo Zhou
Annie Qu
OffRL
46
1
0
28 Sep 2023
Zero-Shot Reinforcement Learning from Low Quality Data
Scott Jeen
Tom Bewley
Jonathan M. Cullen
OffRL
OnRL
46
1
0
26 Sep 2023
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
Nico Gürtler
Sebastian Blaes
Pavel Kolev
Felix Widmaier
Manuel Wüthrich
Stefan Bauer
Bernhard Schölkopf
Georg Martius
OffRL
38
28
0
28 Jul 2023
Hallucinated Adversarial Control for Conservative Offline Policy Evaluation
Jonas Rothfuss
Bhavya Sukhija
Tobias Birchler
Parnian Kassraie
Andreas Krause
OffRL
34
10
0
02 Mar 2023
HOPE: Human-Centric Off-Policy Evaluation for E-Learning and Healthcare
Ge Gao
Song Ju
Markel Sanz Ausin
Min Chi
OffRL
34
8
0
18 Feb 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
Hanlin Zhu
Amy Zhang
OffRL
38
2
0
07 Feb 2023
A Strong Baseline for Batch Imitation Learning
Matthew Smith
Lucas Maystre
Zhenwen Dai
K. Ciosek
OffRL
25
4
0
06 Feb 2023
Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment
Qitong Gao
Stephen L. Schmidt
Afsana Chowdhury
Guangyu Feng
Jennifer J. Peters
Katherine Genty
W. Grill
Dennis A. Turner
Miroslav Pajic
OffRL
38
11
0
05 Feb 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
49
16
0
30 Jan 2023
Variational Latent Branching Model for Off-Policy Evaluation
Qitong Gao
Ge Gao
Min Chi
Miroslav Pajic
OffRL
41
6
0
28 Jan 2023
Off-Policy Evaluation for Action-Dependent Non-Stationary Environments
Yash Chandak
Shiv Shankar
Nathaniel D. Bastian
Bruno Castro da Silva
Emma Brunskill
Philip S. Thomas
OffRL
52
6
0
24 Jan 2023
A first-order augmented Lagrangian method for constrained minimax optimization
Zhaosong Lu
Sanyou Mei
39
6
0
05 Jan 2023
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
30
0
0
29 Dec 2022
Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning
Guoxi Zhang
H. Kashima
OffRL
34
2
0
29 Nov 2022
Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Audrey Huang
Nan Jiang
OffRL
62
9
0
27 Oct 2022
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning
Yi Zhao
Rinu Boney
Alexander Ilin
Arno Solin
Joni Pajarinen
OffRL
OnRL
28
39
0
25 Oct 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
34
16
0
26 Jul 2022
Safe-FinRL: A Low Bias and Variance Deep Reinforcement Learning Implementation for High-Freq Stock Trading
Zitao Song
Xuyang Jin
Chenliang Li
OffRL
AIFin
31
1
0
13 Jun 2022
Federated Offline Reinforcement Learning
D. Zhou
Yufeng Zhang
Aaron Sonabend-W
Zhaoran Wang
Junwei Lu
Tianxi Cai
OffRL
40
13
0
11 Jun 2022
Offline Policy Comparison with Confidence: Benchmarks and Baselines
Anurag Koul
Mariano Phielipp
Alan Fern
OffRL
35
0
0
22 May 2022
User-Interactive Offline Reinforcement Learning
Phillip Swazinna
Steffen Udluft
Thomas Runkler
OffRL
36
11
0
21 May 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
117
243
0
20 May 2022
Off-Policy Evaluation with Online Adaptation for Robot Exploration in Challenging Environments
Yafei Hu
Junyi Geng
Chen Wang
John Keller
Sebastian Scherer
OffRL
39
15
0
07 Apr 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
34
34
0
25 Mar 2022
DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning
Jinxin Liu
Hongyin Zhang
Donglin Wang
OffRL
38
33
0
13 Mar 2022
LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation
Geon-hyeong Kim
Jongmin Lee
Youngsoo Jang
Hongseok Yang
Kyungmin Kim
OffRL
38
15
0
28 Feb 2022
Reinforcement Learning in Practice: Opportunities and Challenges
Yuxi Li
OffRL
43
9
0
23 Feb 2022
Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process
C. Shi
Jin Zhu
Ye Shen
Shuang Luo
Hong Zhu
R. Song
OffRL
38
30
0
22 Feb 2022
A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets
C. Shi
Runzhe Wan
Ge Song
Shuang Luo
R. Song
Hongtu Zhu
OffRL
43
6
0
21 Feb 2022
Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching
Yecheng Jason Ma
Andrew Shen
Dinesh Jayaraman
Osbert Bastani
OffRL
28
32
0
04 Feb 2022
Efficient Performance Bounds for Primal-Dual Reinforcement Learning from Demonstrations
Angeliki Kamoutsi
G. Banjac
John Lygeros
OffRL
31
7
0
28 Dec 2021
Off Environment Evaluation Using Convex Risk Minimization
Pulkit Katdare
Shuijing Liu
Katherine Driggs-Campbell
18
2
0
21 Dec 2021
Continual Learning In Environments With Polynomial Mixing Times
Matthew D Riemer
Sharath Chandra Raparthy
Ignacio Cases
G. Subbaraj
M. P. Touzel
Irina Rish
CLL
43
8
0
13 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
46
4
0
29 Nov 2021