Learning from Bandit Feedback: An Overview of the State-of-the-art

18 September 2019

Papers citing "Learning from Bandit Feedback: An Overview of the State-of-the-art"

Offline Evaluation of Reward-Optimizing Recommender Systems: The Case of Simulation
Imad Aouali, Amine Benhalloum, Martin Bompaire, Benjamin Heymann, Olivier Jeunen, D. Rohde, Otmane Sakhi, Flavian Vasile
OffRL | 56 | 2 | 0 | 18 Sep 2022

Residual Overfit Method of Exploration
James McInerney, Nathan Kallus
OffRL, UQCV | 22 | 0 | 0 | 06 Oct 2021

MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces
Marlesson R. O. Santana, Luckeciano C. Melo, Fernando H. F. Camargo, Bruno Brandão, Anderson Soares, Renan M. Oliveira, Sandor Caetano
OffRL | 45 | 15 | 0 | 30 Sep 2020