When is Realizability Sufficient for Off-Policy Reinforcement Learning?
International Conference on Machine Learning (ICML), 2022
10 November 2022
Andrea Zanette
    OffRL

Papers citing "When is Realizability Sufficient for Off-Policy Reinforcement Learning?"

16 / 16 papers shown
Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
Nan Jiang
Tengyang Xie
OffRL
148
9
0
05 Oct 2025
Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design
Andreas Schlaginhaufen
Reda Ouhamma
Maryam Kamgarpour
180
1
0
11 Jun 2025
Quantum Non-Linear Bandit Optimization
Zakaria Shams Siam
Chaowen Guan
Chong Liu
326
1
0
04 Mar 2025
Enhancing PPO with Trajectory-Aware Hybrid Policies
Qisai Liu
Zhanhong Jiang
Hsin-Jung Yang
Mahsa Khosravi
Joshua R. Waite
Soumik Sarkar
222
0
0
21 Feb 2025
A Model Selection Approach for Corruption Robust Reinforcement Learning
International Conference on Algorithmic Learning Theory (ALT), 2021
Chen-Yu Wei
Christoph Dann
Julian Zimmert
265
48
0
31 Dec 2024
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu-Juan Luo
Tianying Ji
Gang Hua
Jianwei Zhang
Huazhe Xu
Xianyuan Zhan
OffRL
268
4
0
29 May 2024
Optimal Design for Human Feedback
Subhojyoti Mukherjee
Anusha Lalitha
Kousha Kalantari
Aniket Deshmukh
Ge Liu
Yifei Ma
Branislav Kveton
344
0
0
22 Apr 2024
A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage
Kevin Tan
Ziping Xu
OffRL, OnRL
273
5
0
07 Mar 2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Yifei Zhou
Andrea Zanette
Jiayi Pan
Sergey Levine
Aviral Kumar
277
119
0
29 Feb 2024
Regularized Q-Learning with Linear Function Approximation
IEEE Transactions on Automatic Control (TAC), 2024
Jiachen Xi
Alfredo Garcia
P. Momcilovic
432
2
0
26 Jan 2024
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
International Conference on Learning Representations (ICLR), 2023
Zhaoyi Zhou
Chuning Zhu
Runlong Zhou
Qiwen Cui
Abhishek Gupta
S. S. Du
OffRL
174
10
0
30 Oct 2023
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
International Conference on Learning Representations (ICLR), 2023
Emmeran Johnson
Ciara Pike-Burke
Patrick Rebeschini
OffRL
228
2
0
02 Oct 2023
Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems
Xiang Ji
Huazheng Wang
Minshuo Chen
Tuo Zhao
Mengdi Wang
OffRL
266
8
0
24 Jul 2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data
Neural Information Processing Systems (NeurIPS), 2023
Ruiqi Zhang
Andrea Zanette
OffRL, OnRL
236
9
0
10 Jul 2023
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Kihyuk Hong
Yuhang Li
Ambuj Tewari
OffRL
302
9
0
13 Jun 2023
Principled Reinforcement Learning with Human Feedback from Pairwise or $K$-wise Comparisons
International Conference on Machine Learning (ICML), 2023
Banghua Zhu
Jiantao Jiao
Michael I. Jordan
OffRL
362
245
0
26 Jan 2023