2106.11692
Cited By
A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
22 June 2021
Yunchang Yang
Tianhao Wu
Han Zhong
Evrard Garcelon
Matteo Pirotta
A. Lazaric
Liwei Wang
S. Du
OffRL
Papers citing
"A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning"
4 / 4 papers shown
Title
Pairwise Proximal Policy Optimization: Harnessing Relative Feedback for LLM Alignment
Tianhao Wu
Banghua Zhu
Ruoyu Zhang
Zhaojin Wen
Kannan Ramchandran
Jiantao Jiao
41
54
0
30 Sep 2023
Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees
Pengfei Li
Jianyi Yang
Shaolei Ren
OffRL
27
4
0
31 May 2023
UCB Momentum Q-learning: Correcting the bias without forgetting
Pierre Menard
O. D. Domingues
Xuedong Shang
Michal Valko
79
40
0
01 Mar 2021
Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints
Chi Jin
Zhuoran Yang
Zhaoran Wang
OffRL
122
166
0
06 Jan 2021