Towards minimax policies for online linear optimization with bandit feedback

14 February 2012

Papers citing "Towards minimax policies for online linear optimization with bandit feedback"


Efficient Near-Optimal Algorithm for Online Shortest Paths in Directed Acyclic Graphs with Bandit Feedback Against Adaptive Adversaries
Arnab Maiti, Zhiyuan Fan, Kevin Jamieson, Lillian J. Ratliff, Gabriele Farina
242 0 0
01 Apr 2025

Adaptive Sampling for Stochastic Risk-Averse Learning
Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause
64 53 0
28 Oct 2019

Minimax Policies for Combinatorial Prediction Games
Jean-Yves Audibert, Sébastien Bubeck, Gabor Lugosi
OffRL
111 81 0
24 May 2011