No-Regret Reinforcement Learning in Smooth MDPs. International Conference on Machine Learning (ICML), 2024 |
Apple Tasting Revisited: Bayesian Approaches to Partially Monitored Online Binary Classification. James A. Grant, David S. Leslie |
Policy Optimization as Online Learning with Mediator Feedback. AAAI Conference on Artificial Intelligence (AAAI), 2020 |
Smooth Bandit Optimization: Generalization to Hölder Space. International Conference on Artificial Intelligence and Statistics (AISTATS), 2020 |