Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation

International Conference on Machine Learning (ICML), 2020
21 February 2020
Yaqi Duan
Mengdi Wang
    OffRL
arXiv: 2002.09516

Papers citing "Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation"

50 / 114 papers shown
Finite-Time Bounds for Average-Reward Fitted Q-Iteration
Jongmin Lee
Ernest K. Ryu
OffRL
128
0
0
20 Oct 2025
Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
Nan Jiang
Tengyang Xie
OffRL
235
15
0
05 Oct 2025
A Tutorial: An Intuitive Explanation of Offline Reinforcement Learning Theory
Fengdi Che
OffRL
181
0
0
11 Aug 2025
Augmenting Online RL with Offline Data is All You Need: A Unified Hybrid RL Algorithm Design and Analysis
Ruiquan Huang
Donghao Li
Chengshuai Shi
Cong Shen
Jing Yang
OffRL
484
0
0
01 Jul 2025
Generalized Linear Markov Decision Process
Sinian Zhang
Kaicheng Zhang
Ziping Xu
Tianxi Cai
D. Zhou
273
0
0
01 Jun 2025
Square$χ$PO: Differentially Private and Robust $χ^2$-Preference Optimization in Offline Direct Alignment
Xingyu Zhou
Yulian Wu
Wenqian Weng
Francesco Orabona
421
0
0
27 May 2025
NeuroSep-CP-LCB: A Deep Learning-based Contextual Multi-armed Bandit Algorithm with Uncertainty Quantification for Early Sepsis Prediction
Anni Zhou
Raheem Beyah
Rishikesan Kamaleswaran
324
1
0
20 Mar 2025
Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Ojash Neopane
Aaditya Ramdas
Aarti Singh
CML
283
3
0
21 Nov 2024
Hybrid Transfer Reinforcement Learning: Provable Sample Efficiency from Shifted-Dynamics Data
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Chengrui Qu
Laixi Shi
Kishan Panaganti
Pengcheng You
Adam Wierman
OffRL OnRL
299
6
0
06 Nov 2024
Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning
Yen-Ru Lai
Fu-Chieh Chang
Pei-Yuan Wu
OffRL
581
1
0
22 Aug 2024
Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs
Neural Information Processing Systems (NeurIPS), 2024
Kevin Tan
Wei Fan
Yuting Wei
OffRL
362
5
0
08 Aug 2024
Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning
Dake Zhang
Boxiang Lyu
Delin Qu
Mladen Kolar
Tong Zhang
OffRL
289
3
0
10 Jul 2024
A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
International Conference on Machine Learning (ICML), 2024
Jiayi Wang
Zhengling Qi
Raymond K. W. Wong
206
0
0
14 Jun 2024
Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models
Xiang Ji
Sanjeev Kulkarni
Mengdi Wang
Tengyang Xie
OffRL
376
11
0
06 Jun 2024
From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems
Jianliang He
Siyu Chen
Fengzhuo Zhang
Zhuoran Yang
LM&Ro LLMAG
349
11
0
30 May 2024
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL OnRL
293
7
0
28 May 2024
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^π$-Realizability and Concentrability
Volodymyr Tkachuk
Gellert Weisz
Csaba Szepesvári
OffRL
234
3
0
27 May 2024
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Vanshaj Khattar
Yuhao Ding
Bilgehan Sel
Javad Lavaei
Ming Jin
OffRL
286
23
0
26 May 2024
Imitation Learning in Discounted Linear MDPs without exploration assumptions
International Conference on Machine Learning (ICML), 2024
Luca Viano
Stratis Skoulakis
Volkan Cevher
345
8
0
03 May 2024
Multiple-policy Evaluation via Density Estimation
Yilei Chen
Aldo Pacchiano
I. Paschalidis
OffRL
390
1
0
29 Mar 2024
Diffusion Model for Data-Driven Black-Box Optimization
Zihao Li
Hui Yuan
Kaixuan Huang
Chengzhuo Ni
Yinyu Ye
Minshuo Chen
Mengdi Wang
DiffM
328
22
0
20 Mar 2024
On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
Yuheng Zhang
Nan Jiang
OffRL
298
7
0
22 Feb 2024
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption
Chen Ye
Jiafan He
Quanquan Gu
Tong Zhang
350
10
0
14 Feb 2024
Reward-Relevance-Filtered Linear Offline Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Angela Zhou
OffRL
297
3
0
23 Jan 2024
Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
Neural Information Processing Systems (NeurIPS), 2024
Yaqi Duan
Martin J. Wainwright
OffRL
242
3
0
10 Jan 2024
On Sample-Efficient Offline Reinforcement Learning: Data Diversity, Posterior Sampling, and Beyond
Thanh Nguyen-Tang
Raman Arora
OffRL
370
5
0
06 Jan 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
232
0
0
24 Dec 2023
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Jin Zhu
Runzhe Wan
Zhengling Qi
Shuang Luo
C. Shi
OffRL
382
5
0
28 Oct 2023
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $ε$-Greedy Exploration
Neural Information Processing Systems (NeurIPS), 2023
Shuai Zhang
Hongkang Li
Meng Wang
Miao Liu
Pin-Yu Chen
Songtao Lu
Sijia Liu
K. Murugesan
Subhajit Chaudhury
360
47
0
24 Oct 2023
Corruption-Robust Offline Reinforcement Learning with General Function Approximation
Neural Information Processing Systems (NeurIPS), 2023
Chen Ye
Rui Yang
Quanquan Gu
Tong Zhang
OffRL
475
30
0
23 Oct 2023
Sample Complexity of Preference-Based Nonparametric Off-Policy Evaluation with Deep Networks
Zihao Li
Xiang Ji
Minshuo Chen
Mengdi Wang
OffRL
301
0
0
16 Oct 2023
Bi-Level Offline Policy Optimization with Limited Exploration
Neural Information Processing Systems (NeurIPS), 2023
Wenzhuo Zhou
OffRL
309
5
0
10 Oct 2023
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
International Conference on Learning Representations (ICLR), 2023
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
OffRL
356
2
0
02 Oct 2023
Stackelberg Batch Policy Learning
Wenzhuo Zhou
Annie Qu
OffRL
315
1
0
28 Sep 2023
Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework
Wenzhuo Zhou
Yuhan Li
Ruoqing Zhu
Annie Qu
OffRL
321
7
0
23 Sep 2023
The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation
International Conference on Machine Learning (ICML), 2023
Philip Amortila
Nan Jiang
Csaba Szepesvári
OffRL
277
5
0
25 Jul 2023
Offline Policy Evaluation for Reinforcement Learning with Adaptively Collected Data
Sunil Madhow
Dan Xiao
Ming Yin
Yu-Xiang Wang
OffRL
314
0
0
24 Jun 2023
On the Model-Misspecification in Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Yunfan Li
Lin F. Yang
329
6
0
19 Jun 2023
High-probability sample complexities for policy evaluation with linear function approximation
IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023
Gen Li
Weichen Wu
Yuejie Chi
Cong Ma
Alessandro Rinaldo
Yuting Wei
OffRL
450
10
0
30 May 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
553
88
0
29 May 2023
Conformal Off-Policy Evaluation in Markov Decision Processes
IEEE Conference on Decision and Control (CDC), 2023
Daniele Foffano
Alessio Russo
Alexandre Proutiere
OffRL
407
9
0
05 Apr 2023
A Unified Framework of Policy Learning for Contextual Bandit with Confounding Bias and Missing Observations
Siyu Chen
Yitan Wang
Zhaoran Wang
Zhuoran Yang
OffRL
240
3
0
20 Mar 2023
Pseudo-Labeling for Kernel Ridge Regression under Covariate Shift
Kaizheng Wang
335
17
0
20 Feb 2023
A Review of Off-Policy Evaluation in Reinforcement Learning
Masatoshi Uehara
C. Shi
Nathan Kallus
OffRL
292
110
0
13 Dec 2022
Counterfactual Learning with General Data-generating Policies
AAAI Conference on Artificial Intelligence (AAAI), 2022
Yusuke Narita
Kyohei Okumura
Akihiro Shimizu
Kohei Yata
CML OffRL
180
2
0
04 Dec 2022
Offline Policy Evaluation and Optimization under Confounding
International Conference on Artificial Intelligence and Statistics (AISTATS), 2022
Chinmaya Kausik
Yangyi Lu
Kevin Tan
Maggie Makar
Yixin Wang
Ambuj Tewari
OffRL
419
15
0
29 Nov 2022
Offline Reinforcement Learning with Closed-Form Policy Improvement Operators
International Conference on Machine Learning (ICML), 2022
Jiachen Li
Edwin Zhang
Ming Yin
Qinxun Bai
Yu Wang
William Yang Wang
OffRL
308
18
0
29 Nov 2022
On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation
AAAI Conference on Artificial Intelligence (AAAI), 2022
Thanh Nguyen-Tang
Ming Yin
Sunil R. Gupta
Svetha Venkatesh
R. Arora
OffRL
230
24
0
23 Nov 2022
When is Realizability Sufficient for Off-Policy Reinforcement Learning?
International Conference on Machine Learning (ICML), 2022
Andrea Zanette
OffRL
362
16
0
10 Nov 2022
Oracle Inequalities for Model Selection in Offline Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Jonathan Lee
George Tucker
Ofir Nachum
Bo Dai
Emma Brunskill
OffRL
384
14
0
03 Nov 2022