Provably Feedback-Efficient Reinforcement Learning via Active Reward Learning. Neural Information Processing Systems (NeurIPS), 2023.
Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2022.
Gap-Dependent Unsupervised Exploration for Reinforcement Learning. International Conference on Artificial Intelligence and Statistics (AISTATS), 2021.
Provably Efficient Algorithms for Multi-Objective Competitive RL. International Conference on Machine Learning (ICML), 2021.