ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.05804
  4. Cited By
Near-optimal Offline Reinforcement Learning with Linear Representation:
  Leveraging Variance Information with Pessimism

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism

11 March 2022
Ming Yin
Yaqi Duan
Mengdi Wang
Yu-Xiang Wang
    OffRL
ArXivPDFHTML

Papers citing "Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism"

50 / 55 papers shown
Title
Towards Optimal Differentially Private Regret Bounds in Linear MDPs
Towards Optimal Differentially Private Regret Bounds in Linear MDPs
Sharan Sahu
55
0
0
12 Apr 2025
Towards User-level Private Reinforcement Learning with Human Feedback
Towards User-level Private Reinforcement Learning with Human Feedback
J. Zhang
Mingxi Lei
Meng Ding
Mengdi Li
Zihang Xiang
Difei Xu
Jinhui Xu
Di Wang
42
0
0
22 Feb 2025
Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques
Preference-Based Multi-Agent Reinforcement Learning: Data Coverage and Algorithmic Techniques
Natalia Zhang
X. Wang
Qiwen Cui
Runlong Zhou
Sham Kakade
Simon S. Du
OffRL
48
0
0
10 Jan 2025
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic
  Management in Network Simulation
NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation
Momin Haider
Ming Yin
Menglei Zhang
Arpit Gupta
Jing Zhu
Yu-Xiang Wang
OffRL
26
1
0
30 Oct 2024
Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning
Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning
Yen-Ru Lai
Fu-Chieh Chang
Pei-Yuan Wu
OffRL
64
1
0
22 Aug 2024
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Kevin Tan
Wei Fan
Yuting Wei
OffRL
69
2
0
08 Aug 2024
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
Dake Zhang
Boxiang Lyu
Shuang Qiu
Mladen Kolar
Tong Zhang
OffRL
30
0
0
10 Jul 2024
The Role of Inherent Bellman Error in Offline Reinforcement Learning
  with Linear Function Approximation
The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation
Noah Golowich
Ankur Moitra
OffRL
29
2
0
17 Jun 2024
Self-Play with Adversarial Critic: Provable and Scalable Offline
  Alignment for Language Models
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models
Xiang Ji
Sanjeev Kulkarni
Mengdi Wang
Tengyang Xie
OffRL
35
4
0
06 Jun 2024
Learning the Target Network in Function Space
Learning the Target Network in Function Space
Kavosh Asadi
Yao Liu
Shoham Sabach
Ming Yin
Rasool Fakoor
33
0
0
03 Jun 2024
Combining Experimental and Historical Data for Policy Evaluation
Combining Experimental and Historical Data for Policy Evaluation
Ting Li
Chengchun Shi
Qianglin Wen
Yang Sui
Yongli Qin
Chunbo Lai
Hongtu Zhu
OffRL
44
0
0
01 Jun 2024
Towards Robust Policy: Enhancing Offline Reinforcement Learning with
  Adversarial Attacks and Defenses
Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses
Thanh Nguyen
T. Luu
Tri Ton
Chang D. Yoo
OffRL
AAML
32
0
0
18 May 2024
Sample Complexity of Offline Distributionally Robust Linear Markov
  Decision Processes
Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes
He Wang
Laixi Shi
Yuejie Chi
OffRL
29
6
0
19 Mar 2024
Is Offline Decision Making Possible with Only Few Samples? Reliable
  Decisions in Data-Starved Bandits via Trust Region Enhancement
Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement
Ruiqi Zhang
Yuexiang Zhai
Andrea Zanette
41
0
0
24 Feb 2024
On the Curses of Future and History in Future-dependent Value Functions
  for Off-policy Evaluation
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
27
4
0
22 Feb 2024
Iterative Preference Learning from Human Feedback: Bridging Theory and
  Practice for RLHF under KL-Constraint
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint
Wei Xiong
Hanze Dong
Chen Ye
Ziqi Wang
Han Zhong
Heng Ji
Nan Jiang
Tong Zhang
OffRL
36
155
0
18 Dec 2023
Differentially Private Reward Estimation with Preference Feedback
Differentially Private Reward Estimation with Preference Feedback
Sayak Ray Chowdhury
Xingyu Zhou
Nagarajan Natarajan
26
4
0
30 Oct 2023
Posterior Sampling with Delayed Feedback for Reinforcement Learning with
  Linear Function Approximation
Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation
Nikki Lijing Kuang
Ming Yin
Mengdi Wang
Yu-Xiang Wang
Yian Ma
24
6
0
29 Oct 2023
Bridging Distributionally Robust Learning and Offline RL: An Approach to
  Mitigate Distribution Shift and Partial Data Coverage
Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage
Kishan Panaganti
Zaiyan Xu
D. Kalathil
Mohammad Ghavamzadeh
OOD
OffRL
13
6
0
27 Oct 2023
Pessimistic Nonlinear Least-Squares Value Iteration for Offline
  Reinforcement Learning
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
Qiwei Di
Heyang Zhao
Jiafan He
Quanquan Gu
OffRL
50
5
0
02 Oct 2023
Robust Offline Reinforcement Learning -- Certify the Confidence Interval
Aayush Mishra
Simon S. Du
OffRL
16
0
0
28 Sep 2023
Settling the Sample Complexity of Online Reinforcement Learning
Settling the Sample Complexity of Online Reinforcement Learning
Zihan Zhang
Yuxin Chen
Jason D. Lee
S. Du
OffRL
90
21
0
25 Jul 2023
Provable Benefits of Policy Learning from Human Preferences in
  Contextual Bandit Problems
Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems
Xiang Ji
Huazheng Wang
Minshuo Chen
Tuo Zhao
Mengdi Wang
OffRL
19
6
0
24 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments
  using Offline Data
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Ruiqi Zhang
Andrea Zanette
OffRL
OnRL
35
5
0
10 Jul 2023
Offline Policy Evaluation for Reinforcement Learning with Adaptively
  Collected Data
Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data
Sunil Madhow
Dan Xiao
Ming Yin
Yu-Xiang Wang
OffRL
18
0
0
24 Jun 2023
Provably Efficient Offline Reinforcement Learning with Perturbed Data
  Sources
Provably Efficient Offline Reinforcement Learning with Perturbed Data Sources
Chengshuai Shi
Wei Xiong
Cong Shen
Jing Yang
OffRL
25
3
0
14 Jun 2023
Regularization and Variance-Weighted Regression Achieves Minimax
  Optimality in Linear MDPs: Theory and Practice
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice
Toshinori Kitamura
Tadashi Kozuno
Yunhao Tang
Nino Vieillard
Michal Valko
...
Olivier Pietquin
M. Geist
Csaba Szepesvári
Wataru Kumagai
Yutaka Matsuo
OffRL
24
2
0
22 May 2023
A Unified Framework of Policy Learning for Contextual Bandit with
  Confounding Bias and Missing Observations
A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations
Siyu Chen
Yitan Wang
Zhaoran Wang
Zhuoran Yang
OffRL
28
2
0
20 Mar 2023
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function
  Approximation
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
Thanh Nguyen-Tang
R. Arora
OffRL
38
5
0
24 Feb 2023
Offline Learning in Markov Games with General Function Approximation
Offline Learning in Markov Games with General Function Approximation
Yuheng Zhang
Yunru Bai
Nan Jiang
OffRL
13
8
0
06 Feb 2023
Importance Weighted Actor-Critic for Optimal Conservative Offline
  Reinforcement Learning
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu
Paria Rashidinejad
Jiantao Jiao
OffRL
30
15
0
30 Jan 2023
Scaling Marginalized Importance Sampling to High-Dimensional
  State-Spaces via State Abstraction
Scaling Marginalized Importance Sampling to High-Dimensional State-Spaces via State Abstraction
Brahma S. Pavse
Josiah P. Hanna
OffRL
32
7
0
14 Dec 2022
Near-Optimal Differentially Private Reinforcement Learning
Near-Optimal Differentially Private Reinforcement Learning
Dan Qiao
Yu-Xiang Wang
22
13
0
09 Dec 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement
  Operators
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu-Xiang Wang
William Yang Wang
OffRL
24
15
0
29 Nov 2022
On Instance-Dependent Bounds for Offline Reinforcement Learning with
  Linear Function Approximation
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
50
15
0
23 Nov 2022
Leveraging Offline Data in Online Reinforcement Learning
Leveraging Offline Data in Online Reinforcement Learning
Andrew Wagenmaker
Aldo Pacchiano
OffRL
OnRL
27
36
0
09 Nov 2022
Optimal Conservative Offline RL with General Function Approximation via
  Augmented Lagrangian
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
Paria Rashidinejad
Hanlin Zhu
Kunhe Yang
Stuart J. Russell
Jiantao Jiao
OffRL
33
26
0
01 Nov 2022
Offline Reinforcement Learning with Differentiable Function
  Approximation is Provably Efficient
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Ming Yin
Mengdi Wang
Yu-Xiang Wang
OffRL
61
11
0
03 Oct 2022
Relational Reasoning via Set Transformers: Provable Efficiency and
  Applications to MARL
Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL
Fengzhuo Zhang
Boyi Liu
Kaixin Wang
Vincent Y. F. Tan
Zhuoran Yang
Zhaoran Wang
OffRL
LRM
49
10
0
20 Sep 2022
Distributionally Robust Offline Reinforcement Learning with Linear
  Function Approximation
Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation
Xiaoteng Ma
Zhipeng Liang
Jose H. Blanchet
MingWen Liu
Li Xia
Jiheng Zhang
Qianchuan Zhao
Zhengyuan Zhou
OOD
OffRL
25
21
0
14 Sep 2022
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise
  Reward
Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward
Tengyu Xu
Yue Wang
Shaofeng Zou
Yingbin Liang
OffRL
22
12
0
13 Jun 2022
Offline Stochastic Shortest Path: Learning, Evaluation and Towards
  Optimality
Offline Stochastic Shortest Path: Learning, Evaluation and Towards Optimality
Ming Yin
Wenjing Chen
Mengdi Wang
Yu-Xiang Wang
OffRL
25
4
0
10 Jun 2022
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
RORL: Robust Offline Reinforcement Learning via Conservative Smoothing
Rui Yang
Chenjia Bai
Xiaoteng Ma
Zhaoran Wang
Chongjie Zhang
Lei Han
OffRL
24
74
0
06 Jun 2022
Offline Reinforcement Learning with Differential Privacy
Offline Reinforcement Learning with Differential Privacy
Dan Qiao
Yu-Xiang Wang
OffRL
27
23
0
02 Jun 2022
Pessimism in the Face of Confounders: Provably Efficient Offline
  Reinforcement Learning in Partially Observable Markov Decision Processes
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Miao Lu
Yifei Min
Zhaoran Wang
Zhuoran Yang
OffRL
45
22
0
26 May 2022
Pessimism for Offline Linear Contextual Bandits using $\ell_p$
  Confidence Sets
Pessimism for Offline Linear Contextual Bandits using ℓp\ell_pℓp​ Confidence Sets
Gen Li
Cong Ma
Nathan Srebro
OffRL
28
11
0
21 May 2022
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Bellman Residual Orthogonalization for Offline Reinforcement Learning
Andrea Zanette
Martin J. Wainwright
OffRL
22
8
0
24 Mar 2022
The Efficacy of Pessimism in Asynchronous Q-Learning
The Efficacy of Pessimism in Asynchronous Q-Learning
Yuling Yan
Gen Li
Yuxin Chen
Jianqing Fan
OffRL
70
40
0
14 Mar 2022
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards
  Optimal Sample Complexity
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity
Laixi Shi
Gen Li
Yuting Wei
Yuxin Chen
Yuejie Chi
OffRL
21
90
0
28 Feb 2022
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted
  Iteration
Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration
Chengzhuo Ni
Ruiqi Zhang
Xiang Ji
Xuezhou Zhang
Mengdi Wang
OffRL
13
1
0
31 Jan 2022
12
Next