Bellman-consistent Pessimism for Offline Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2021

13 June 2021

Papers citing "Bellman-consistent Pessimism for Offline Reinforcement Learning"

50 / 224 papers shown

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption Rui Yang Han Zhong Jiawei Xu Amy Zhang Chong Zhang Lei Han Tong Zhang OffRL OnRL 260 22 0 19 Oct 2023
Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning. Conference on Robot Learning (CoRL), 2023 Jianlan Luo Perry Dong Jeffrey Wu Aviral Kumar Xinyang Geng Sergey Levine OffRL 175 29 0 18 Oct 2023
Learning Regularized Monotone Graphon Mean-Field Games. Neural Information Processing Systems (NeurIPS), 2023 Fengzhuo Zhang Vincent Y. F. Tan Zhaoran Wang Zhuoran Yang 142 10 0 12 Oct 2023
Bi-Level Offline Policy Optimization with Limited Exploration. Neural Information Processing Systems (NeurIPS), 2023 Wenzhuo Zhou OffRL 175 5 0 10 Oct 2023
Sample-Efficient Multi-Agent RL: An Optimization Perspective. International Conference on Learning Representations (ICLR), 2023 Nuoya Xiong Zhihan Liu Zhaoran Wang Zhuoran Yang 147 1 0 10 Oct 2023
$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis. International Conference on Learning Representations (ICLR), 2023 Zishun Yu Yunzhe Tao Liyu Chen Tao Sun Hongxia Yang 161 18 0 04 Oct 2023
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning. International Conference on Learning Representations (ICLR), 2023 Qiwei Di Heyang Zhao Jiafan He Quanquan Gu OffRL 169 6 0 02 Oct 2023
Stackelberg Batch Policy Learning Wenzhuo Zhou Annie Qu OffRL 166 1 0 28 Sep 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework Wenzhuo Zhou Yuhan Li Ruoqing Zhu Annie Qu OffRL 161 7 0 23 Sep 2023
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Jianzhun Shao Yun Qu Chen Chen Hongchang Zhang Xiangyang Ji OffRL 138 34 0 22 Sep 2023
Model-based Offline Policy Optimization with Adversarial Network. European Conference on Artificial Intelligence (ECAI), 2023 Junming Yang Xingguo Chen Shengyuan Wang Bolei Zhang OffRL 93 3 0 05 Sep 2023
Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems Xiang Ji Huazheng Wang Minshuo Chen Tuo Zhao Mengdi Wang OffRL 207 8 0 24 Jul 2023
Model-based Offline Reinforcement Learning with Count-based Conservatism. International Conference on Machine Learning (ICML), 2023 Byeongchang Kim Min Hwan Oh OffRL 115 12 0 21 Jul 2023
Bayesian Safe Policy Learning with Chance Constrained Optimization: Application to Military Security Assessment during the Vietnam War Zeyang Jia Eli Ben-Michael Kosuke Imai 206 5 0 17 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data. Neural Information Processing Systems (NeurIPS), 2023 Ruiqi Zhang Andrea Zanette OffRL OnRL 176 9 0 10 Jul 2023
Provably Efficient UCB-type Algorithms For Learning Predictive State Representations. International Conference on Learning Representations (ICLR), 2023 Ruiquan Huang Yitao Liang J. Yang OffRL 246 6 0 01 Jul 2023
Soft Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity. International Conference on Learning Representations (ICLR), 2023 Runyu Zhang Yang Hu Na Li 263 11 0 20 Jun 2023
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources. International Conference on Machine Learning (ICML), 2023 Chengshuai Shi Wei Xiong Cong Shen Jing Yang OffRL 144 4 0 14 Jun 2023
Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 Lequn Wang A. Krishnamurthy Aleksandrs Slivkins OffRL 180 12 0 13 Jun 2023
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 Kihyuk Hong Yuhang Li Ambuj Tewari OffRL 226 8 0 13 Jun 2023
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective. Neural Information Processing Systems (NeurIPS), 2023 Zeyu Zhang Yi-Hsun Su Hui Yuan Yiran Wu R. Balasubramanian Qingyun Wu Huazheng Wang Mengdi Wang OffRL CML 212 7 0 13 Jun 2023
Instructed Diffuser with Temporal Condition Guidance for Offline Reinforcement Learning Jifeng Hu Yan Sun Sili Huang Siyuan Guo Hechang Chen Li Shen Lichao Sun Yi-Ju Chang Dacheng Tao DiffM OffRL 123 15 0 08 Jun 2023
Survival Instinct in Offline Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Anqi Li Dipendra Kumar Misra Andrey Kolobov Ching-An Cheng OffRL 172 19 0 05 Jun 2023
Bayesian Regret Minimization in Offline Bandits. International Conference on Machine Learning (ICML), 2023 Marek Petrik Guy Tennenholtz Mohammad Ghavamzadeh OffRL 223 0 0 02 Jun 2023
Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding. International Conference on Learning Representations (ICLR), 2023 Alizée Pace Hugo Yèche Bernhard Schölkopf Gunnar Rätsch Guy Tennenholtz OffRL 137 8 0 01 Jun 2023
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning Peizhong Ju A. Ghosh Ness B. Shroff 176 5 0 01 Jun 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration. Neural Information Processing Systems (NeurIPS), 2023 Zhihan Liu Miao Lu Wei Xiong Han Zhong Haotian Hu Shenao Zhang Sirui Zheng Zhuoran Yang Zhaoran Wang OffRL 248 24 0 29 May 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Kaiwen Wang Kevin Zhou Runzhe Wu Nathan Kallus Wen Sun OffRL 292 23 0 25 May 2023
Provable Offline Preference-Based Reinforcement Learning. International Conference on Learning Representations (ICLR), 2023 Wenhao Zhan Masatoshi Uehara Nathan Kallus Jason D. Lee Wen Sun OffRL 218 38 0 24 May 2023
Offline Primal-Dual Reinforcement Learning for Linear MDPs. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 Germano Gabbianelli Gergely Neu Nneka Okolo Matteo Papini OffRL 140 11 0 22 May 2023
Offline Reinforcement Learning with Additional Covering Distributions Chenjie Mao OffRL 175 0 0 22 May 2023
Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Gen Li Wenhao Zhan Jason D. Lee Yuejie Chi Yuxin Chen OffRL OnRL 195 14 0 17 May 2023
Double Pessimism is Provably Efficient for Distributionally Robust Offline Reinforcement Learning: Generic Algorithm and Robust Partial Coverage. Neural Information Processing Systems (NeurIPS), 2023 Jose H. Blanchet Miao Lu Tong Zhang Han Zhong OffRL 156 45 0 16 May 2023
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning. International Conference on Machine Learning (ICML), 2023 Yulai Zhao Zhuoran Yang Zhaoran Wang Jason D. Lee 126 6 0 08 May 2023
What can online reinforcement learning with function approximation benefit from general coverage conditions? International Conference on Machine Learning (ICML), 2023 Fanghui Liu Luca Viano Volkan Cevher OffRL 136 4 0 25 Apr 2023
MAHALO: Unifying Offline Reinforcement Learning and Imitation Learning from Observations. International Conference on Machine Learning (ICML), 2023 Anqi Li Byron Boots Ching-An Cheng OffRL 192 21 0 30 Mar 2023
Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale Botao Hao Rahul Jain Dengwang Tang Zheng Wen OffRL 138 5 0 20 Mar 2023
A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations Siyu Chen Yitan Wang Zhaoran Wang Zhuoran Yang OffRL 141 2 0 20 Mar 2023
Variance-aware robust reinforcement learning with linear function approximation under heavy-tailed rewards Xiang Li Qiang Sun 169 9 0 09 Mar 2023
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning. Neural Information Processing Systems (NeurIPS), 2023 Mitsuhiko Nakamoto Yuexiang Zhai Anika Singh Max Sobol Mark Yi-An Ma Chelsea Finn Aviral Kumar Sergey Levine OffRL OnRL 332 162 0 09 Mar 2023
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms. International Conference on Machine Learning (ICML), 2023 Anirudh Vemula Yuda Song Aarti Singh J. Andrew Bagnell Sanjiban Choudhury OffRL 125 15 0 01 Mar 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation. International Conference on Learning Representations (ICLR), 2023 Thanh Nguyen-Tang R. Arora OffRL 174 6 0 24 Feb 2023
Adversarial Model for Offline Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 M. Bhardwaj Tengyang Xie Byron Boots Nan Jiang Ching-An Cheng AAML OffRL 166 33 0 21 Feb 2023
Distributional Offline Policy Evaluation with Predictive Error Guarantees. International Conference on Machine Learning (ICML), 2023 Runzhe Wu Masatoshi Uehara Wen Sun OffRL 137 17 0 19 Feb 2023
Robust Knowledge Transfer in Tiered Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Jiawei Huang Niao He OffRL 214 1 0 10 Feb 2023
Provably Efficient Offline Goal-Conditioned Reinforcement Learning with General Function Approximation and Single-Policy Concentrability. Neural Information Processing Systems (NeurIPS), 2023 Hanlin Zhu Amy Zhang OffRL 190 5 0 07 Feb 2023
Offline Learning in Markov Games with General Function Approximation. International Conference on Machine Learning (ICML), 2023 Yuheng Zhang Yunru Bai Nan Jiang OffRL 170 10 0 06 Feb 2023
Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage. Neural Information Processing Systems (NeurIPS), 2023 Masatoshi Uehara Nathan Kallus Jason D. Lee Wen Sun OffRL 227 6 0 05 Feb 2023
Reinforcement Learning in Low-Rank MDPs with Density Features. International Conference on Machine Learning (ICML), 2023 Audrey Huang Jinglin Chen Nan Jiang OffRL 146 14 0 04 Feb 2023
Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders David Bruns-Smith Angela Zhou OffRL 267 13 0 01 Feb 2023
