ResearchTrend.AI
Online Policy Learning from Offline Preferences
arXiv: 2403.10160

15 March 2024
Guoxi Zhang
Han Bao
Hisashi Kashima
    OffRL

Papers citing "Online Policy Learning from Offline Preferences"

1 / 1 papers shown
Title
Batch Reinforcement Learning from Crowds
Guoxi Zhang
H. Kashima
OffRL
32
5
0
08 Nov 2021