Federated UCBVI: Communication-Efficient Federated Regret Minimization with Heterogeneous Agents. International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
Robot Policy Learning with Temporal Optimal Transport Reward. Neural Information Processing Systems (NeurIPS), 2024
Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization. International Conference on Learning Representations (ICLR), 2024
Probabilistic Inference in Reinforcement Learning Done Right. Neural Information Processing Systems (NeurIPS), 2023
Minimax Optimal Q Learning with Nearest Neighbors. IEEE Transactions on Information Theory (IEEE Trans. Inf. Theory), 2023