Is Pessimism Provably Efficient for Offline RL?

International Conference on Machine Learning (ICML), 2020
30 December 2020
Ying Jin
Zhuoran Yang
Zhaoran Wang
    OffRL

Papers citing "Is Pessimism Provably Efficient for Offline RL?"

40 / 290 papers shown
Title
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
230
1
0
31 Jan 2022
Robust Imitation Learning from Corrupted Demonstrations
Liu Liu
Ziyang Tang
Lanqing Li
Dijun Luo
154
15
0
29 Jan 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
International Conference on Machine Learning (ICML), 2022
Xiaohong Chen
Zhengling Qi
OffRL
351
35
0
17 Jan 2022
When is Offline Two-Player Zero-Sum Markov Game Solvable?
Qiwen Cui
S. Du
OffRL
162
29
0
10 Jan 2022
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
Han Zhong
Zhuoran Yang
Zhaoran Wang
Sai Li
237
33
0
27 Dec 2021
Model Selection in Batch Policy Optimization
International Conference on Machine Learning (ICML), 2021
Jonathan Lee
George Tucker
Ofir Nachum
Bo Dai
OffRL
172
12
0
23 Dec 2021
Pessimistic Model Selection for Offline Deep Reinforcement Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Chao-Han Huck Yang
Zhengling Qi
Yifan Cui
Pin-Yu Chen
OffRL
199
4
0
29 Nov 2021
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
International Conference on Learning Representations (ICLR), 2021
Thanh Nguyen-Tang
Sunil R. Gupta
A. Nguyen
Svetha Venkatesh
OffRL
166
33
0
27 Nov 2021
Compressive Features in Offline Reinforcement Learning for Recommender Systems
Hung Nguyen
Minh Nguyen
Long Pham
Jennifer Adorno Nieves
OffRL
103
3
0
16 Nov 2021
Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning
Siyuan Zhang
Nan Jiang
OffRL
294
40
0
26 Oct 2021
False Correlation Reduction for Offline Reinforcement Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Arvindkumar Krishnakumar
Zuyue Fu
Lingxiao Wang
Zhuoran Yang
Chenjia Bai
Tianyi Zhou
Judy Hoffman
Jing Jiang
OffRL
174
11
0
24 Oct 2021
Offline Reinforcement Learning with Value-based Episodic Memory
Xiaoteng Ma
Yiqin Yang
Haotian Hu
Qihan Liu
Jun Yang
Chongjie Zhang
Qianchuan Zhao
Bin Liang
OffRL
194
46
0
19 Oct 2021
Towards Instance-Optimal Offline Reinforcement Learning with Pessimism
Ming Yin
Yu Wang
OffRL
190
85
0
17 Oct 2021
Value Penalized Q-Learning for Recommender Systems
Chengqian Gao
Ke Xu
Kuangqi Zhou
Lanqing Li
Xueqian Wang
Bo Yuan
P. Zhao
OffRL
163
22
0
15 Oct 2021
Representation Learning for Online and Offline RL in Low-rank MDPs
International Conference on Learning Representations (ICLR), 2021
Masatoshi Uehara
Xuezhou Zhang
Wen Sun
OffRL
377
132
0
09 Oct 2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
Emma Brunskill
OffRL
222
127
0
19 Aug 2021
Provably Efficient Generative Adversarial Imitation Learning for Online and Offline Setting with Linear Function Approximation
Zhihan Liu
Yufeng Zhang
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
OffRL
107
7
0
19 Aug 2021
Bandit Algorithms for Precision Medicine
Yangyi Lu
Ziping Xu
Ambuj Tewari
213
16
0
10 Aug 2021
Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support
Hung The Tran
Sunil R. Gupta
Thanh Nguyen-Tang
Santu Rana
Svetha Venkatesh
OffRL
130
6
0
24 Jul 2021
Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
Masatoshi Uehara
Wen Sun
OffRL
329
159
0
13 Jul 2021
Variance-Aware Off-Policy Evaluation with Linear Function Approximation
Neural Information Processing Systems (NeurIPS), 2021
Yifei Min
Tianhao Wang
Dongruo Zhou
Quanquan Gu
OffRL
191
40
0
22 Jun 2021
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL
Conference on Uncertainty in Artificial Intelligence (UAI), 2021
Weitong Zhang
Jiafan He
Dongruo Zhou
Amy Zhang
Quanquan Gu
OffRL
191
12
0
22 Jun 2021
The Curse of Passive Data Collection in Batch Reinforcement Learning
Chenjun Xiao
Ilbin Lee
Bo Dai
Dale Schuurmans
Csaba Szepesvári
OffRL
176
1
0
18 Jun 2021
Bellman-consistent Pessimism for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Tengyang Xie
Ching-An Cheng
Nan Jiang
Paul Mineiro
Alekh Agarwal
OffRL, LRM
595
300
0
13 Jun 2021
Corruption-Robust Offline Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
Xuezhou Zhang
Yiding Chen
Jerry Zhu
Wen Sun
OffRL
150
49
0
11 Jun 2021
Offline Reinforcement Learning as Anti-Exploration
AAAI Conference on Artificial Intelligence (AAAI), 2021
Shideh Rezaeifar
Robert Dadashi
Nino Vieillard
Léonard Hussenot
Olivier Bachem
Olivier Pietquin
Matthieu Geist
OffRL
180
60
0
11 Jun 2021
Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Tengyang Xie
Nan Jiang
Huan Wang
Caiming Xiong
Yu Bai
OffRL, OnRL
220
179
0
09 Jun 2021
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage
Jonathan D. Chang
Masatoshi Uehara
Dhruv Sreenivas
Rahul Kidambi
Wen Sun
OffRL
268
36
0
06 Jun 2021
Heuristic-Guided Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Ching-An Cheng
Andrey Kolobov
Adith Swaminathan
OffRL
224
71
0
05 Jun 2021
Offline Reinforcement Learning as One Big Sequence Modeling Problem
Neural Information Processing Systems (NeurIPS), 2021
Michael Janner
Qiyang Li
Sergey Levine
OffRL
553
776
0
03 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Neural Information Processing Systems (NeurIPS), 2021
Ming Yin
Yu Wang
OffRL
241
19
0
13 May 2021
Towards Theoretical Understandings of Robust Markov Decision Processes: Sample Complexity and Asymptotics
Wenhao Yang
Liangyu Zhang
Zhihua Zhang
177
34
0
09 May 2021
Policy Learning with Adaptively Collected Data
Management Sciences (MS), 2021
Ruohan Zhan
Zhimei Ren
Susan Athey
Zhengyuan Zhou
OffRL
211
31
0
05 May 2021
On the Optimality of Batch Policy Optimization Algorithms
International Conference on Machine Learning (ICML), 2021
Chenjun Xiao
Yifan Wu
Tor Lattimore
Bo Dai
Jincheng Mei
Lihong Li
Csaba Szepesvári
Dale Schuurmans
OffRL
156
34
0
06 Apr 2021
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2021
Paria Rashidinejad
Banghua Zhu
Cong Ma
Jiantao Jiao
Stuart J. Russell
OffRL
602
308
0
22 Mar 2021
Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks
Thanh Nguyen-Tang
Sunil R. Gupta
Hung The Tran
Svetha Venkatesh
OffRL
326
7
0
11 Mar 2021
Uncertainty Estimation Using Riemannian Model Dynamics for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Guy Tennenholtz
Shie Mannor
OffRL
182
14
0
22 Feb 2021
Continuous Doubly Constrained Batch Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2021
Rasool Fakoor
Jonas W. Mueller
Kavosh Asadi
Pratik Chaudhari
Alex Smola
OffRL
495
32
0
18 Feb 2021
COMBO: Conservative Offline Model-Based Policy Optimization
Neural Information Processing Systems (NeurIPS), 2021
Tianhe Yu
Aviral Kumar
Rafael Rafailov
Aravind Rajeswaran
Sergey Levine
Chelsea Finn
OffRL
519
469
0
16 Feb 2021
Near-Optimal Offline Reinforcement Learning via Double Variance Reduction
Neural Information Processing Systems (NeurIPS), 2021
Ming Yin
Yu Bai
Yu Wang
OffRL
196
70
0
02 Feb 2021