Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning

19 August 2021
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
    OffRL

Papers citing "Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning"

40 / 90 papers shown
Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach
Yunzhe Zhou
Zhengling Qi
C. Shi
Lexin Li
OffRL
10
7
0
26 Oct 2022
Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
Yuda Song
Yi Zhou
Ayush Sekhari
J. Andrew Bagnell
A. Krishnamurthy
Wen Sun
OffRL
OnRL
30
90
0
13 Oct 2022
The Role of Coverage in Online Reinforcement Learning
Tengyang Xie
Dylan J. Foster
Yu Bai
Nan Jiang
Sham Kakade
OffRL
22
57
0
09 Oct 2022
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin
Mengdi Wang
Yu-Xiang Wang
OffRL
58
11
0
03 Oct 2022
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
Wenlong Mou
Martin J. Wainwright
Peter L. Bartlett
OffRL
15
10
0
26 Sep 2022
Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma
Zhipeng Liang
Jose H. Blanchet
MingWen Liu
Li Xia
Jiheng Zhang
Qianchuan Zhao
Zhengyuan Zhou
OOD
OffRL
23
21
0
14 Sep 2022
Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu
Zhuoran Yang
Jianqing Fan
OffRL
13
8
0
23 Aug 2022
A Survey of Learning on Small Data: Generalization, Optimization, and Challenge
Xiaofeng Cao
Weixin Bu
Sheng-Jun Huang
Minling Zhang
Ivor W. Tsang
Yew-Soon Ong
James T. Kwok
30
1
0
29 Jul 2022
Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Masatoshi Uehara
Haruka Kiyohara
Andrew Bennett
Victor Chernozhukov
Nan Jiang
Nathan Kallus
C. Shi
Wen Sun
OffRL
23
16
0
26 Jul 2022
Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination
Jiafei Lyu
Xiu Li
Zongqing Lu
OffRL
14
24
0
16 Jun 2022
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Tengyu Xu
Yue Wang
Shaofeng Zou
Yingbin Liang
OffRL
20
12
0
13 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu-Xiang Wang
OffRL
25
4
0
10 Jun 2022
Mildly Conservative Q-Learning for Offline Reinforcement Learning
Jiafei Lyu
Xiaoteng Ma
Xiu Li
Zongqing Lu
OffRL
10
101
0
09 Jun 2022
Offline Reinforcement Learning with Differential Privacy
Dan Qiao
Yu-Xiang Wang
OffRL
27
23
0
02 Jun 2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
Andrea Zanette
Martin J. Wainwright
OOD
28
5
0
01 Jun 2022
On Gap-dependent Bounds for Offline Reinforcement Learning
Xinqi Wang
Qiwen Cui
S. Du
OffRL
71
11
0
01 Jun 2022
Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus
Qiwen Cui
S. Du
OffRL
11
19
0
01 Jun 2022
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu
Yifei Min
Zhaoran Wang
Zhuoran Yang
OffRL
45
22
0
26 May 2022
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
Jianxiong Li
Xianyuan Zhan
Haoran Xu
Xiangyu Zhu
Jingjing Liu
Ya-Qin Zhang
OffRL
22
24
0
23 May 2022
Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
Gen Li
Cong Ma
Nathan Srebro
OffRL
28
11
0
21 May 2022
Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
Boxiang Lyu
Zhaoran Wang
Mladen Kolar
Zhuoran Yang
OffRL
15
4
0
05 May 2022
Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Jinglin Chen
Nan Jiang
OffRL
19
33
0
25 Mar 2022
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
OffRL
22
8
0
24 Mar 2022
Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
Ming Yin
Yaqi Duan
Mengdi Wang
Yu-Xiang Wang
OffRL
21
65
0
11 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
18
90
0
28 Feb 2022
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
Chenjia Bai
Lingxiao Wang
Zhuoran Yang
Zhihong Deng
Animesh Garg
Peng Liu
Zhaoran Wang
OffRL
19
132
0
23 Feb 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
Han Zhong
Wei Xiong
Jiyuan Tan
Liwei Wang
Tong Zhang
Zhaoran Wang
Zhuoran Yang
OffRL
13
37
0
15 Feb 2022
Adversarially Trained Actor Critic for Offline Reinforcement Learning
Ching-An Cheng
Tengyang Xie
Nan Jiang
Alekh Agarwal
OffRL
11
124
0
05 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
11
1
0
31 Jan 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
Xiaohong Chen
Zhengling Qi
OffRL
20
31
0
17 Jan 2022
When is Offline Two-Player Zero-Sum Markov Game Solvable?
Qiwen Cui
S. Du
OffRL
23
29
0
10 Jan 2022
Model Selection in Batch Policy Optimization
Jonathan Lee
George Tucker
Ofir Nachum
Bo Dai
OffRL
11
12
0
23 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
16
4
0
29 Nov 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu-Xiang Wang
OffRL
16
82
0
17 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
48
125
0
09 Oct 2021
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
Masatoshi Uehara
Wen Sun
OffRL
91
144
0
13 Jul 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning
Tengyang Xie
Ching-An Cheng
Nan Jiang
Paul Mineiro
Alekh Agarwal
OffRL
LRM
20
267
0
13 Jun 2021
Model-free Representation Learning and Exploration in Low-rank MDPs
Aditya Modi
Jinglin Chen
A. Krishnamurthy
Nan Jiang
Alekh Agarwal
OffRL
98
78
0
14 Feb 2021
Policy Mirror Descent for Reinforcement Learning: Linear Convergence, New Sampling Complexity, and Generalized Problem Classes
Guanghui Lan
87
135
0
30 Jan 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
329
1,949
0
04 May 2020