2406.09574
Online Bandit Learning with Offline Preference Data
13 June 2024
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
Papers citing "Online Bandit Learning with Offline Preference Data"
2 / 2 papers shown
e-COP : Episodic Constrained Optimization of Policies
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Sahil Singla
OffRL
27
0
0
13 Jun 2024
Artificial Replay: A Meta-Algorithm for Harnessing Historical Data in Bandits
Siddhartha Banerjee
Sean R. Sinclair
Milind Tambe
Lily Xu
C. Yu
AI4TS
29
6
0
30 Sep 2022