ResearchTrend.AI

Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

28 August 2023
Hanhan Zhou
Tian-Shing Lan
Vaneet Aggarwal
    OffRL

Papers citing "Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning"

3 / 3 papers shown
MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization
Yongsheng Mei
Hanhan Zhou
Tian-Shing Lan
Guru Venkataramani
Peng Wei
33
38
0
21 Feb 2023
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Keiran Paster
Sheila A. McIlraith
Jimmy Ba
OffRL
136
27
0
31 May 2022
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
212
832
0
12 Oct 2021