Efficient Planning in Large MDPs with Weak Linear Function Approximation. Neural Information Processing Systems (NeurIPS), 2020
BRPO: Batch Residual Policy Optimization. International Joint Conference on Artificial Intelligence (IJCAI), 2020
Large Scale Markov Decision Processes with Changing Rewards. Neural Information Processing Systems (NeurIPS), 2019