Misspecification in Inverse Reinforcement Learning. AAAI Conference on Artificial Intelligence (AAAI), 2022
Reward Gaming in Conditional Text Generation. Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Scaling Laws for Reward Model Overoptimization. International Conference on Machine Learning (ICML), 2022
The Alignment Problem from a Deep Learning Perspective. International Conference on Learning Representations (ICLR), 2022