Model-Free Linear Quadratic Control via Reduction to Expert Prediction

17 April 2018 · arXiv:1804.06021
Yasin Abbasi-Yadkori, N. Lazić, Csaba Szepesvári
Abstract

Model-free approaches for reinforcement learning (RL) and continuous control find policies based only on past states and rewards, without fitting a model of the system dynamics. They are appealing because they are general purpose and easy to implement; however, they also come with fewer theoretical guarantees than model-based RL. In this work, we present a new model-free algorithm for controlling linear quadratic (LQ) systems, and show that its regret scales as $O(T^{\xi+2/3})$ for any small $\xi>0$, provided the time horizon satisfies $T>C^{1/\xi}$ for a constant $C$. The algorithm is based on a reduction of the control of Markov decision processes to an expert prediction problem. In practice, it corresponds to a variant of policy iteration with forced exploration, where the policy in each phase is greedy with respect to the average of all previous value functions. This is the first model-free algorithm for adaptive control of LQ systems that provably achieves sublinear regret and has polynomial computation cost. Empirically, our algorithm dramatically outperforms standard policy iteration, but performs worse than a model-based approach.
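The abstract describes the method only at a high level. As a concrete illustration, here is a minimal, hypothetical Python sketch of a phased model-free policy-iteration loop of that flavor: each phase evaluates the current policy's Q-function from exploratory rollouts (an LSTD-style least-squares fit under a discounted criterion, which stands in for the paper's average-cost setting), then switches to the policy that is greedy with respect to the average of all Q-function estimates gathered so far. Every name, constant, and simplification below is an illustrative assumption, not the paper's construction.

```python
# Hypothetical sketch, not the paper's actual algorithm or analysis.
import numpy as np

def features(x, u):
    """Quadratic features: upper triangle of z z^T for z = [x; u],
    with off-diagonal entries doubled so theta @ phi(x, u) == z^T H z."""
    z = np.concatenate([x, u])
    i, j = np.triu_indices(len(z))
    return np.outer(z, z)[i, j] * np.where(i == j, 1.0, 2.0)

def theta_to_H(theta, d):
    """Rebuild the symmetric Q-function matrix H from its parameter vector."""
    i, j = np.triu_indices(d)
    H = np.zeros((d, d))
    H[i, j] = theta
    H[j, i] = theta
    return H

def rollout(A, B, K, Q, R, x0, steps, sigma, rng):
    """Collect transitions under u = K x + noise; the learner only sees
    (x, u, cost, x'), never the matrices A and B (model-free)."""
    data, x = [], x0
    for _ in range(steps):
        u = K @ x + sigma * rng.standard_normal(B.shape[1])  # forced exploration
        cost = x @ Q @ x + u @ R @ u
        x_next = A @ x + B @ u  # noiseless dynamics keep the true Q exactly quadratic
        data.append((x, u, cost, x_next))
        x = x_next
    return data

def lstd_q(data, K, gamma, reg=1e-6):
    """LSTD-style least-squares evaluation of the quadratic Q-function of policy K."""
    d = len(features(*data[0][:2]))
    M, b = reg * np.eye(d), np.zeros(d)
    for x, u, cost, x_next in data:
        phi = features(x, u)
        phi_next = features(x_next, K @ x_next)
        M += np.outer(phi, phi - gamma * phi_next)
        b += cost * phi
    return np.linalg.solve(M, b)

def averaged_policy_iteration(A, B, Q, R, phases=10, steps=2000,
                              gamma=0.95, sigma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n, m = B.shape
    K = np.zeros((m, n))  # initial policy; assumes A is stable enough to roll out
    thetas = []
    for _ in range(phases):
        data = rollout(A, B, K, Q, R, rng.standard_normal(n), steps, sigma, rng)
        thetas.append(lstd_q(data, K, gamma))
        # greedy w.r.t. the AVERAGE of all Q-function estimates so far
        H = theta_to_H(np.mean(thetas, axis=0), n + m)
        K = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return K
```

Roughly speaking, the averaging step is what distinguishes this scheme from vanilla policy iteration, which is greedy with respect to only the latest value-function estimate; averaging over all previous estimates damps the oscillations a single noisy estimate can cause.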
