Quantum Speedups in Regret Analysis of Infinite Horizon Average-Reward Markov Decision Processes

18 October 2023
Bhargav Ganguly
Yang Xu
Vaneet Aggarwal
Abstract

This paper investigates the potential of quantum acceleration in addressing infinite horizon Markov Decision Processes (MDPs) to enhance average reward outcomes. We introduce an innovative quantum framework for the agent's engagement with an unknown MDP, extending the conventional interaction paradigm. Our approach involves the design of an optimism-driven tabular Reinforcement Learning algorithm that harnesses quantum signals acquired by the agent through efficient quantum mean estimation techniques. Through thorough theoretical analysis, we demonstrate that the quantum advantage in mean estimation leads to exponential advancements in regret guarantees for infinite horizon Reinforcement Learning. Specifically, the proposed Quantum algorithm achieves a regret bound of $\tilde{\mathcal{O}}(1)$, a significant improvement over the $\tilde{\mathcal{O}}(\sqrt{T})$ bound exhibited by classical counterparts.
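
To give a rough feel for why a faster mean estimator can change the regret scaling, the following is a heuristic sketch only, not the paper's algorithm or proof. It assumes that per-step regret is proportional to the estimation error after n samples, taking 1/sqrt(n) for a classical estimator and 1/n for a quantum mean estimator (the quadratic speedup commonly cited for quantum mean estimation). The function names and rates are illustrative assumptions; constants, state and action counts, and logarithmic factors are ignored.

```python
import math

def classical_rate(n):
    # Assumed classical estimation error after n samples: ~ 1/sqrt(n).
    return 1.0 / math.sqrt(n)

def quantum_rate(n):
    # Assumed quantum mean-estimation error after n samples: ~ 1/n.
    return 1.0 / n

def cumulative_error(T, rate):
    # Heuristic proxy for regret: sum the per-step error over T steps.
    return sum(rate(t) for t in range(1, T + 1))

if __name__ == "__main__":
    for T in (10**2, 10**4, 10**6):
        c = cumulative_error(T, classical_rate)  # grows like sqrt(T)
        q = cumulative_error(T, quantum_rate)    # grows like log(T)
        print(f"T={T:>8}  classical~{c:10.1f}  quantum~{q:6.2f}")
```

Under these assumptions the classical sum grows on the order of sqrt(T) while the quantum sum grows only logarithmically in T, which is the sense in which a bound that hides logarithmic factors can be written as $\tilde{\mathcal{O}}(1)$.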
