2210.04157
The Role of Coverage in Online Reinforcement Learning
9 October 2022
Tengyang Xie
Dylan J. Foster
Yu Bai
Nan Jiang
Sham Kakade
OffRL
Papers citing
"The Role of Coverage in Online Reinforcement Learning"
41 / 41 papers shown
Title
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
77
0
0
26 Feb 2025
Decision Making in Hybrid Environments: A Model Aggregation Approach
Haolin Liu
Chen-Yu Wei
Julian Zimmert
83
0
0
09 Feb 2025
Improving Environment Novelty Quantification for Effective Unsupervised Environment Design
Jayden Teoh
Wenjun Li
Pradeep Varakantham
53
1
0
08 Feb 2025
Hybrid Preference Optimization for Alignment: Provably Faster Convergence Rates by Combining Offline Preferences with Online Exploration
Avinandan Bose
Zhihan Xiong
Aadirupa Saha
S. Du
Maryam Fazel
71
1
0
13 Dec 2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Heyang Zhao
Chenlu Ye
Quanquan Gu
Tong Zhang
OffRL
57
3
0
07 Nov 2024
The Central Role of the Loss Function in Reinforcement Learning
Kaiwen Wang
Nathan Kallus
Wen Sun
OffRL
41
7
0
19 Sep 2024
Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization
Talha Bozkus
Urbashi Mitra
28
2
0
29 Aug 2024
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Kevin Tan
Wei Fan
Yuting Wei
OffRL
69
2
0
08 Aug 2024
Hybrid Reinforcement Learning from Offline Observation Alone
Yuda Song
J. Andrew Bagnell
Aarti Singh
OffRL
74
2
0
11 Jun 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
Jeongyeol Kwon
Shie Mannor
C. Caramanis
Yonathan Efroni
OffRL
29
2
0
03 Jun 2024
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
Tengyang Xie
Dylan J. Foster
Akshay Krishnamurthy
Corby Rosset
Ahmed Hassan Awadallah
Alexander Rakhlin
41
33
0
31 May 2024
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Haotian Hu
Yiqin Yang
Jianing Ye
Chengjie Wu
Ziqing Mai
Yujing Hu
Tangjie Lv
Changjie Fan
Qianchuan Zhao
Chongjie Zhang
OffRL
OnRL
37
3
0
31 May 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
Corby Rosset
Ching-An Cheng
Arindam Mitra
Michael Santacroce
Ahmed Hassan Awadallah
Tengyang Xie
144
114
0
04 Apr 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
19
0
0
29 Mar 2024
The Value of Reward Lookahead in Reinforcement Learning
Nadav Merlis
Dorian Baudry
Vianney Perchet
13
0
0
18 Mar 2024
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan
Ziping Xu
OffRL
OnRL
34
4
0
07 Mar 2024
Advancing Investment Frontiers: Industry-grade Deep Reinforcement Learning for Portfolio Optimization
Philip Ndikum
Serge Ndikum
44
1
0
27 Feb 2024
On the Performance of Empirical Risk Minimization with Smoothed Data
Adam Block
Alexander Rakhlin
Abhishek Shetty
39
3
0
22 Feb 2024
Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap
Mohammad Mehrabi
Stefan Wager
OffRL
24
14
0
13 Feb 2024
Online Iterative Reinforcement Learning from Human Feedback with General Preference Model
Chen Ye
Wei Xiong
Yuheng Zhang
Nan Jiang
Tong Zhang
OffRL
38
9
0
11 Feb 2024
Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
Jiawei Huang
Niao He
Andreas Krause
24
6
0
08 Feb 2024
Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning
P. Amortila
Tongyi Cao
Akshay Krishnamurthy
OffRL
OOD
38
1
0
22 Jan 2024
Agnostic Interactive Imitation Learning: New Theory and Practical Algorithms
Yichen Li
Chicheng Zhang
OffRL
31
0
0
28 Dec 2023
Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees
Yifei Zhou
Ayush Sekhari
Yuda Song
Wen Sun
OffRL
OnRL
30
8
0
14 Nov 2023
On the Theory of Risk-Aware Agents: Bridging Actor-Critic and Economics
Michal Nauman
Marek Cygan
27
1
0
30 Oct 2023
Unsupervised Behavior Extraction via Random Intent Priors
Haotian Hu
Yiqin Yang
Jianing Ye
Ziqing Mai
Chongjie Zhang
OffRL
32
6
0
28 Oct 2023
When is Agnostic Reinforcement Learning Statistically Tractable?
Zeyu Jia
Gene Li
Alexander Rakhlin
Ayush Sekhari
Nathan Srebro
OffRL
22
5
0
09 Oct 2023
Efficient Model-Free Exploration in Low-Rank MDPs
Zakaria Mhammedi
Adam Block
Dylan J. Foster
Alexander Rakhlin
OffRL
19
13
0
08 Jul 2023
Active Coverage for PAC Reinforcement Learning
Aymen Al Marjani
Andrea Tirinzoni
E. Kaufmann
OffRL
16
4
0
23 Jun 2023
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
Zhihan Liu
Miao Lu
Wei Xiong
Han Zhong
Haotian Hu
Shenao Zhang
Sirui Zheng
Zhuoran Yang
Zhaoran Wang
OffRL
32
22
0
29 May 2023
The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning
Kaiwen Wang
Kevin Zhou
Runzhe Wu
Nathan Kallus
Wen Sun
OffRL
26
17
0
25 May 2023
On the Statistical Efficiency of Mean Field Reinforcement Learning with General Function Approximation
Jiawei Huang
Batuhan Yardim
Niao He
31
10
0
18 May 2023
What can online reinforcement learning with function approximation benefit from general coverage conditions?
Fanghui Liu
Luca Viano
V. Cevher
OffRL
12
2
0
25 Apr 2023
Eluder-based Regret for Stochastic Contextual MDPs
Orin Levy
Asaf B. Cassel
Alon Cohen
Yishay Mansour
23
5
0
27 Nov 2022
Model-Free Reinforcement Learning with the Decision-Estimation Coefficient
Dylan J. Foster
Noah Golowich
Jian Qian
Alexander Rakhlin
Ayush Sekhari
OffRL
30
9
0
25 Nov 2022
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Andrea Zanette
OffRL
11
14
0
10 Nov 2022
Leveraging Offline Data in Online Reinforcement Learning
Andrew Wagenmaker
Aldo Pacchiano
OffRL
OnRL
27
36
0
09 Nov 2022
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
107
166
0
06 Jan 2021
Reward-Free Exploration for Reinforcement Learning
Chi Jin
A. Krishnamurthy
Max Simchowitz
Tiancheng Yu
OffRL
104
194
0
07 Feb 2020
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang
Ruosong Wang
S. Du
A. Krishnamurthy
127
135
0
09 Dec 2019
Deep Reinforcement Learning for Dialogue Generation
Jiwei Li
Will Monroe
Alan Ritter
Michel Galley
Jianfeng Gao
Dan Jurafsky
198
1,327
0
05 Jun 2016