Reward-Directed Conditional Diffusion: Provable Distribution Estimation and Reward Improvement. Neural Information Processing Systems (NeurIPS), 2023
PAC-Bayesian Offline Contextual Bandits With Guarantees. International Conference on Machine Learning (ICML), 2022
Offline Policy Optimization with Eligible Actions. Conference on Uncertainty in Artificial Intelligence (UAI), 2022
Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization. International Conference on Learning Representations (ICLR), 2021