2012.08225
Cited By
Policy Optimization as Online Learning with Mediator Feedback
15 December 2020
Alberto Maria Metelli
Matteo Papini
P. D'Oro
Marcello Restelli
OffRL
Papers citing
"Policy Optimization as Online Learning with Mediator Feedback"
1 / 1 papers shown
Title
Bounded regret in stochastic multi-armed bandits
Sébastien Bubeck
Vianney Perchet
Philippe Rigollet
63
90
0
06 Feb 2013