

Bellman-consistent Pessimism for Offline Reinforcement Learning
arXiv:2106.06926

Neural Information Processing Systems (NeurIPS), 2021
13 June 2021
Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal
OffRL, LRM

Papers citing "Bellman-consistent Pessimism for Offline Reinforcement Learning"

50 / 224 papers shown
Learning in POMDPs is Sample-Efficient with Hindsight Observability
International Conference on Machine Learning (ICML), 2023
Jonathan Lee, Alekh Agarwal, Christoph Dann, Tong Zhang
31 Jan 2023

STEEL: Singularity-aware Reinforcement Learning
Xiaohong Chen, Zhengling Qi, Runzhe Wan
OffRL
30 Jan 2023

Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao
OffRL
30 Jan 2023

Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons
International Conference on Machine Learning (ICML), 2023
Banghua Zhu, Jiantao Jiao, Michael I. Jordan
OffRL
26 Jan 2023

Machine Learning for Large-Scale Optimization in 6G Wireless Networks
IEEE Communications Surveys and Tutorials (COMST), 2023
Yandong Shi, Lixiang Lian, Yuanming Shi, Zixin Wang, Yong Zhou, Liqun Fu, Lin Bai, Jun Zhang, Wei Zhang
AI4CE
03 Jan 2023

Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality
Ying Jin, Zhimei Ren, Zhuoran Yang, Zhaoran Wang
OffRL
19 Dec 2022

A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara, C. Shi, Nathan Kallus
OffRL
13 Dec 2022

Confidence-Conditioned Value Functions for Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Joey Hong, Aviral Kumar, Sergey Levine
OffRL
08 Dec 2022

One Risk to Rule Them All: A Risk-Sensitive Perspective on Model-Based Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Marc Rigter, Bruno Lacerda, Nick Hawes
OffRL
30 Nov 2022

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning
Chong Chen, Hongyao Tang, Yi-An Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao
OffRL
28 Nov 2022

On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
AAAI Conference on Artificial Intelligence (AAAI), 2022
Thanh Nguyen-Tang, Ming Yin, Sunil R. Gupta, Svetha Venkatesh, R. Arora
OffRL
23 Nov 2022

When is Realizability Sufficient for Off-Policy Reinforcement Learning?
International Conference on Machine Learning (ICML), 2022
Andrea Zanette
OffRL
10 Nov 2022

Leveraging Offline Data in Online Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Andrew Wagenmaker, Aldo Pacchiano
OffRL, OnRL
09 Nov 2022

ARMOR: A Model-based Framework for Improving Arbitrary Baseline Policies with Offline Data
Tengyang Xie, M. Bhardwaj, Nan Jiang, Ching-An Cheng
OffRL
08 Nov 2022

Max-Min Off-Policy Actor-Critic Method Focusing on Worst-Case Robustness to Model Misspecification
Neural Information Processing Systems (NeurIPS), 2022
Takumi Tanabe, Reimi Sato, Kazuto Fukuchi, Jun Sakuma, Youhei Akimoto
OffRL
07 Nov 2022

Oracle Inequalities for Model Selection in Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Lee, George Tucker, Ofir Nachum, Bo Dai, Emma Brunskill
OffRL
03 Nov 2022

Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints
Anika Singh, Aviral Kumar, Q. Vuong, Yevgen Chebotar, Sergey Levine
OffRL
02 Nov 2022

Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
International Conference on Learning Representations (ICLR), 2022
Paria Rashidinejad, Hanlin Zhu, Kunhe Yang, Stuart J. Russell, Jiantao Jiao
OffRL
01 Nov 2022

Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions
Neural Information Processing Systems (NeurIPS), 2022
Audrey Huang, Nan Jiang
OffRL
27 Oct 2022

Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Yunzhe Zhou, Zhengling Qi, C. Shi, Lexin Li
OffRL
26 Oct 2022

Offline congestion games: How feedback type affects data coverage requirement
International Conference on Learning Representations (ICLR), 2022
Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, S. Du
OffRL
24 Oct 2022

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient
International Conference on Learning Representations (ICLR), 2022
Yuda Song, Yi Zhou, Ayush Sekhari, J. Andrew Bagnell, A. Krishnamurthy, Wen Sun
OffRL, OnRL
13 Oct 2022

The Role of Coverage in Online Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham Kakade
OffRL
09 Oct 2022

Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin, Mengdi Wang, Yu Wang
OffRL
03 Oct 2022

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL
Neural Information Processing Systems (NeurIPS), 2022
Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran Yang, Zhaoran Wang
OffRL, LRM
20 Sep 2022

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma, Zhipeng Liang, Jose H. Blanchet, MingWen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou
OOD, OffRL
14 Sep 2022

Strategic Decision-Making in the Presence of Information Asymmetry: Provably Efficient RL with Algorithmic Instruments
Mengxin Yu, Zhuoran Yang, Jianqing Fan
OffRL
23 Aug 2022

Robust Reinforcement Learning using Offline Data
Neural Information Processing Systems (NeurIPS), 2022
Kishan Panaganti, Zaiyan Xu, D. Kalathil, Mohammad Ghavamzadeh
OffRL
10 Aug 2022

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs
Neural Information Processing Systems (NeurIPS), 2022
Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, C. Shi, Wen Sun
OffRL
26 Jul 2022

Online Learning with Off-Policy Feedback
International Conference on Algorithmic Learning Theory (ALT), 2022
Germano Gabbianelli, Matteo Papini, Gergely Neu
OffRL
18 Jul 2022

Offline Equilibrium Finding
Shuxin Li, Xinrun Wang, Youzhi Zhang, Jakub Cerny, Pengdeng Li, Hau Chan, Bo An
OffRL
12 Jul 2022

An Empirical Study of Implicit Regularization in Deep Offline RL
Çağlar Gülçehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet
OffRL
05 Jul 2022

Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2022
Tengyu Xu, Yue Wang, Shaofeng Zou, Yingbin Liang
OffRL
13 Jun 2022

Federated Offline Reinforcement Learning
Journal of the American Statistical Association (JASA), 2022
D. Zhou, Yufeng Zhang, Aaron Sonabend-W, Zhaoran Wang, Junwei Lu, Tianxi Cai
OffRL
11 Jun 2022

Mildly Conservative Q-Learning for Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu
OffRL
09 Jun 2022

On the Role of Discount Factor in Offline Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Haotian Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang
OffRL
07 Jun 2022

RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
Neural Information Processing Systems (NeurIPS), 2022
Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han
OffRL
06 Jun 2022

Pessimistic Off-Policy Optimization for Learning to Rank
European Conference on Artificial Intelligence (ECAI), 2022
Matej Cief, Branislav Kveton, Michal Kompan
OffRL
06 Jun 2022

When does return-conditioned supervised learning work for offline reinforcement learning?
Neural Information Processing Systems (NeurIPS), 2022
David Brandfonbrener, A. Bietti, Jacob Buckman, Romain Laroche, Joan Bruna
OffRL
02 Jun 2022

Offline Reinforcement Learning with Differential Privacy
Neural Information Processing Systems (NeurIPS), 2022
Dan Qiao, Yu Wang
OffRL
02 Jun 2022

Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
International Conference on Machine Learning (ICML), 2022
Andrea Zanette, Martin J. Wainwright
OOD
01 Jun 2022

Provably Efficient Offline Multi-agent Reinforcement Learning via Strategy-wise Bonus
Neural Information Processing Systems (NeurIPS), 2022
Qiwen Cui, S. Du
OffRL
01 Jun 2022

Why So Pessimistic? Estimating Uncertainties for Offline RL through Ensembles, and Why Their Independence Matters
Neural Information Processing Systems (NeurIPS), 2022
Seyed Kamyar Seyed Ghasemipour, S. Gu, Ofir Nachum
OffRL
27 May 2022

Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
International Conference on Learning Representations (ICLR), 2022
Miao Lu, Yifei Min, Zhaoran Wang, Zhuoran Yang
OffRL
26 May 2022

Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret
Neural Information Processing Systems (NeurIPS), 2022
Jiawei Huang, Li Zhao, Tao Qin, Wei Chen, Nan Jiang, Tie-Yan Liu
OffRL
25 May 2022

When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
International Conference on Learning Representations (ICLR), 2022
Jianxiong Li, Xianyuan Zhan, Haoran Xu, Xiangyu Zhu, Jingjing Liu, Ya Zhang
OffRL
23 May 2022

Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
Neural Information Processing Systems (NeurIPS), 2022
Gen Li, Cong Ma, Nathan Srebro
OffRL
21 May 2022

Pessimism meets VCG: Learning Dynamic Mechanism Design via Offline Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Boxiang Lyu, Zhaoran Wang, Mladen Kolar, Zhuoran Yang
OffRL
05 May 2022

When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning?
Aviral Kumar, Joey Hong, Anika Singh, Sergey Levine
OffRL
12 Apr 2022

Offline Reinforcement Learning Under Value and Density-Ratio Realizability: The Power of Gaps
Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Jinglin Chen, Nan Jiang
OffRL
25 Mar 2022