Reward Biased Maximum Likelihood Estimation for Reinforcement Learning

Conference on Learning for Dynamics & Control (L4DC), 2020

16 November 2020

Papers citing "Reward Biased Maximum Likelihood Estimation for Reinforcement Learning"

Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

International Conference on Learning Representations (ICLR), 2024

768

20 Feb 2025

Incentivize without Bonus: Provably Efficient Model-based Online Multi-agent RL for Markov Games

477

13 Feb 2025

Provable Policy Gradient Methods for Average-Reward Markov Potential Games

International Conference on Artificial Intelligence and Statistics (AISTATS), 2024

334

09 Mar 2024

Value-Biased Maximum Likelihood Estimation for Model-based Reinforcement Learning in Discounted Linear MDPs

220

17 Oct 2023

Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration

Neural Information Processing Systems (NeurIPS), 2023

Wei Xiong

395

29 May 2023

When Is Partially Observable Reinforcement Learning Not Scary?

Annual Conference Computational Learning Theory (COLT), 2022

280

125

19 Apr 2022

Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits

AAAI Conference on Artificial Intelligence (AAAI), 2022

Yu-Heng Hung

Ping-Chun Hsieh

294

08 Mar 2022

Augmented RBMLE-UCB Approach for Adaptive Control of Linear Quadratic Systems

Neural Information Processing Systems (NeurIPS), 2022

Akshay Mete

Rahul Singh

P. R. Kumar

168

25 Jan 2022

Reinforcement Learning for Finite-Horizon Restless Multi-Armed Multi-Action Bandits

Efstathia Soufleri

Jian Li

Rahul Singh

277

20 Sep 2021

Learning Augmented Index Policy for Optimal Service Placement at the Network Edge

Efstathia Soufleri

Rahul Singh

Jian Li

305

10 Jan 2021

Whittle index based Q-learning for restless bandits with average reward

Konstantin Avrachenkov

Vivek Borkar

307

29 Apr 2020

Learning in Markov Decision Processes under Constraints

IEEE Transactions on Control of Network Systems (TCNS), 2020

Rahul Singh

Abhishek Gupta

Ness B. Shroff

458

27 Feb 2020