Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi
arXiv:2202.13890 · 28 February 2022 · OffRL
Papers citing "Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity" (20 of 20 papers shown)
1. Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
   Shicong Cen, Jincheng Mei, Katayoon Goshvadi, Hanjun Dai, Tong Yang, Sherry Yang, Dale Schuurmans, Yuejie Chi, Bo Dai
   OffRL · 20 Feb 2025

2. Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
   Zhong Zheng, Haochen Zhang, Lingzhou Xue
   OffRL · 10 Oct 2024

3. Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis
   Qining Zhang, Honghao Wei, Lei Ying
   OffRL · 11 Jun 2024

4. What Are the Odds? Improving the foundations of Statistical Model Checking
   Tobias Meggendorfer, Maximilian Weininger, Patrick Wienhöft
   08 Apr 2024

5. Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
   Jiin Woo, Laixi Shi, Gauri Joshi, Yuejie Chi
   OffRL · 08 Feb 2024

6. MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning
   Mao Hong, Zhiyue Zhang, Yue Wu, Yan Xu
   OffRL · 21 Jan 2024

7. Settling the Sample Complexity of Online Reinforcement Learning
   Zihan Zhang, Yuxin Chen, Jason D. Lee, S. Du
   OffRL · 25 Jul 2023

8. Offline Meta Reinforcement Learning with In-Distribution Online Adaptation
   Jianhao Wang, Jin Zhang, Haozhe Jiang, Junyu Zhang, Liwei Wang, Chongjie Zhang
   OffRL · 31 May 2023

9. High-probability sample complexities for policy evaluation with linear function approximation
   Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei
   OffRL · 30 May 2023

10. Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
    Zihao Li, Zhuoran Yang, Mengdi Wang
    OffRL · 29 May 2023

11. Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
    Jose H. Blanchet, Miao Lu, Tong Zhang, Han Zhong
    OffRL · 16 May 2023

12. Offline Learning in Markov Games with General Function Approximation
    Yuheng Zhang, Yunru Bai, Nan Jiang
    OffRL · 06 Feb 2023

13. Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
    Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
    OffRL · 05 Feb 2023

14. A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP
    Fan Chen, Junyu Zhang, Zaiwen Wen
    OffRL · 13 Jul 2022

15. Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
    Andrea Zanette, Martin J. Wainwright
    OOD · 01 Jun 2022

16. Non-Markovian policies occupancy measures
    Romain Laroche, Rémi Tachet des Combes, Jacob Buckman
    OffRL · 27 May 2022

17. Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
    Ming Yin, Yaqi Duan, Mengdi Wang, Yu-Xiang Wang
    OffRL · 11 Mar 2022

18. Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
    Masatoshi Uehara, Wen Sun
    OffRL · 13 Jul 2021

19. COMBO: Conservative Offline Model-Based Policy Optimization
    Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
    OffRL · 16 Feb 2021

20. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
    Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
    OffRL, GP · 04 May 2020