Deterministic Policies for Constrained Reinforcement Learning in
Polynomial-Time. Neural Information Processing Systems (NeurIPS), 2024 |
A safe exploration approach to constrained Markov decision processes. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 |
Improved Sample Complexity Analysis of Natural Policy Gradient Algorithm
with General Parameterization for Infinite Horizon Discounted Reward Markov
Decision Processes. International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 |