Is Pessimism Provably Efficient for Offline RL? (arXiv:2012.15085)
International Conference on Machine Learning (ICML), 2020
30 December 2020
Ying Jin, Zhuoran Yang, Zhaoran Wang
OffRL

Papers citing "Is Pessimism Provably Efficient for Offline RL?"

Showing 50 of 290 citing papers (page 1 of 6).

Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage
Neural Information Processing Systems (NeurIPS), 2023
Jose H. Blanchet, Miao Lu, Tong Zhang, Han Zhong
OffRL · 16 May 2023

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Yulai Zhao, Zhuoran Yang, Zhaoran Wang, Jason D. Lee
08 May 2023

A Survey on Offline Model-Based Reinforcement Learning
Haoyang He
OffRL · 05 May 2023

What can online reinforcement learning with function approximation benefit from general coverage conditions?
International Conference on Machine Learning (ICML), 2023
Fanghui Liu, Luca Viano, Volkan Cevher
OffRL · 25 Apr 2023

Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning
Neural Information Processing Systems (NeurIPS), 2023
Dingwen Kong, Lin F. Yang
18 Apr 2023

Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning
Annual Conference Computational Learning Theory (COLT), 2023
Gen Li, Yuling Yan, Yuxin Chen, Jianqing Fan
OffRL · 14 Apr 2023

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong, Wei Xiong, Deepanshu Goyal, Yihan Zhang, Winnie Chow, Boyao Wang, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang
ALM · 13 Apr 2023

MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations
International Conference on Machine Learning (ICML), 2023
Anqi Li, Byron Boots, Ching-An Cheng
OffRL · 30 Mar 2023

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale
Botao Hao, Rahul Jain, Dengwang Tang, Zheng Wen
OffRL · 20 Mar 2023

A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations
Siyu Chen, Yitan Wang, Zhaoran Wang, Zhuoran Yang
OffRL · 20 Mar 2023

Decision-Making Under Uncertainty: Beyond Probabilities
International Journal on Software Tools for Technology Transfer (STTT), 2023
Thom S. Badings, T. D. Simão, Marnix Suilen, N. Jansen
UDPER · 10 Mar 2023

Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Neural Information Processing Systems (NeurIPS), 2023
Mitsuhiko Nakamoto, Yuexiang Zhai, Anika Singh, Max Sobol Mark, Yi-An Ma, Chelsea Finn, Aviral Kumar, Sergey Levine
OffRL, OnRL · 09 Mar 2023

Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation
Conference on Uncertainty in Artificial Intelligence (UAI), 2023
Pedro Cisneros-Velarde, Oluwasanmi Koyejo
01 Mar 2023

The In-Sample Softmax for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2023
Chenjun Xiao, Zheng Chen, Yangchen Pan, Adam White, Martha White
OffRL · 28 Feb 2023

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning
Haotian Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang
OffRL · 27 Feb 2023

VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
International Conference on Learning Representations (ICLR), 2023
Thanh Nguyen-Tang, R. Arora
OffRL · 24 Feb 2023

Adversarial Model for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
M. Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng
AAML, OffRL · 21 Feb 2023

Robust Knowledge Transfer in Tiered Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jiawei Huang, Niao He
OffRL · 10 Feb 2023

PASTA: Pessimistic Assortment Optimization
International Conference on Machine Learning (ICML), 2023
Juncheng Dong, Weibin Mo, Zhengling Qi, Cong Shi, Ethan X. Fang, Vahid Tarokh
OffRL · 08 Feb 2023

Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu, Amy Zhang
OffRL · 07 Feb 2023

Offline Learning in Markov Games with General Function Approximation
International Conference on Machine Learning (ICML), 2023
Yuheng Zhang, Yunru Bai, Nan Jiang
OffRL · 06 Feb 2023

Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage
Neural Information Processing Systems (NeurIPS), 2023
Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun
OffRL · 05 Feb 2023

Reinforcement Learning in Low-Rank MDPs with Density Features
International Conference on Machine Learning (ICML), 2023
Audrey Huang, Jinglin Chen, Nan Jiang
OffRL · 04 Feb 2023

Selective Uncertainty Propagation in Offline RL
AAAI Conference on Artificial Intelligence (AAAI), 2023
Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, S. Katariya, Branislav Kveton, A. Rangi
OffRL · 01 Feb 2023

Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders
David Bruns-Smith, Angela Zhou
OffRL · 01 Feb 2023

Learning in POMDPs is Sample-Efficient with Hindsight Observability
International Conference on Machine Learning (ICML), 2023
Jonathan Lee, Alekh Agarwal, Christoph Dann, Tong Zhang
31 Jan 2023

STEEL: Singularity-aware Reinforcement Learning
Xiaohong Chen, Zhengling Qi, Runzhe Wan
OffRL · 30 Jan 2023

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao
OffRL · 30 Jan 2023

Model-based Offline Reinforcement Learning with Local Misspecification
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kefan Dong, Yannis Flet-Berliac, Allen Nie, Emma Brunskill
OffRL · 26 Jan 2023

26 Jan 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or
  $K$-wise Comparisons
Principled Reinforcement Learning with Human Feedback from Pairwise or KKK-wise ComparisonsInternational Conference on Machine Learning (ICML), 2023
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
434
248
0
26 Jan 2023
Risk Sensitive Dead-end Identification in Safety-Critical Offline Reinforcement Learning
Taylor W. Killian, S. Parbhoo, Marzyeh Ghassemi
OffRL · 13 Jan 2023

Safe Policy Improvement for POMDPs via Finite-State Controllers
AAAI Conference on Artificial Intelligence (AAAI), 2023
T. D. Simão, Marnix Suilen, N. Jansen
OffRL · 12 Jan 2023

Offline Reinforcement Learning for Human-Guided Human-Machine Interaction with Private Information
Management Science (MS), 2022
Zuyue Fu, Zhengling Qi, Zhuoran Yang, Zhaoran Wang, Lan Wang
OffRL · 23 Dec 2022

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin
Zhimei Ren
Zhuoran Yang
Zhaoran Wang
OffRL
382
30
0
19 Dec 2022
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara, C. Shi, Nathan Kallus
OffRL · 13 Dec 2022

Multi-Task Off-Policy Learning from Bandit Feedback
International Conference on Machine Learning (ICML), 2022
Joey Hong, Branislav Kveton, S. Katariya, Manzil Zaheer, Mohammad Ghavamzadeh
OffRL · 09 Dec 2022

TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets
Industrial Conference on Data Mining (IDM), 2022
Yuanying Cai, Wei Shen, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tie-Yan Liu
OffRL · 05 Dec 2022

Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yiqin Yang, Haotian Hu, Wenzhe Li, Siyuan Li, Jun Yang, Qianchuan Zhao, Chongjie Zhang
OffRL · 02 Dec 2022

Efficient Reinforcement Learning Through Trajectory Generation
Conference on Learning for Dynamics & Control (L4DC), 2022
Wenqi Cui, Linbin Huang, Weiwei Yang, Baosen Zhang
OffRL · 30 Nov 2022

Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
International Conference on Machine Learning (ICML), 2022
Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu Wang, William Yang Wang
OffRL · 29 Nov 2022

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning
Chong Chen, Hongyao Tang, Yi-An Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao
OffRL · 28 Nov 2022

On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
AAAI Conference on Artificial Intelligence (AAAI), 2022
Thanh Nguyen-Tang, Ming Yin, Sunil R. Gupta, Svetha Venkatesh, R. Arora
OffRL · 23 Nov 2022

Data-Driven Offline Decision-Making via Invariant Representation Learning
Neural Information Processing Systems (NeurIPS), 2022
Qi, Yi-Hsun Su, Aviral Kumar, Sergey Levine
OffRL · 21 Nov 2022

Leveraging Offline Data in Online Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Andrew Wagenmaker, Aldo Pacchiano
OffRL, OnRL · 09 Nov 2022

ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data
Tengyang Xie, M. Bhardwaj, Nan Jiang, Ching-An Cheng
OffRL · 08 Nov 2022

Oracle Inequalities for Model Selection in Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai, Emma Brunskill
OffRL · 03 Nov 2022

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints
Anika Singh, Aviral Kumar, Q. Vuong, Yevgen Chebotar, Sergey Levine
OffRL · 02 Nov 2022

Behavior Prior Representation learning for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Rémi Tachet des Combes, Romain Laroche
OffRL, OnRL · 02 Nov 2022

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
International Conference on Learning Representations (ICLR), 2022
Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart J. Russell, Jiantao Jiao
OffRL · 01 Nov 2022

Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yunzhe Zhou, Zhengling Qi, C. Shi, Lexin Li
OffRL · 26 Oct 2022