ResearchTrend.AI

arXiv:2001.09254

Improper Learning for Non-Stochastic Control

25 January 2020
Max Simchowitz
Karan Singh
Elad Hazan
Abstract

We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control. We introduce a controller parametrization based on the denoised observations and prove that applying online gradient descent to this parametrization yields a new controller which attains sublinear regret against a large class of closed-loop policies. In the fully adversarial setting, our controller attains an optimal regret bound of $\sqrt{T}$ when the system is known and, when combined with an initial stage of least-squares estimation, $T^{2/3}$ when the system is unknown; both yield the first sublinear regret for the partially observed setting. Our bounds are the first in the non-stochastic control setting that compete with \emph{all} stabilizing linear dynamical controllers, not just state feedback. Moreover, in the presence of semi-adversarial noise containing both stochastic and adversarial components, our controller attains the optimal regret bounds of $\mathrm{poly}(\log T)$ when the system is known, and $\sqrt{T}$ when unknown. To our knowledge, this gives the first end-to-end $\sqrt{T}$ regret for the online Linear Quadratic Gaussian (LQG) problem, and applies in a more general setting with adversarial losses and semi-adversarial noise.
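To make the "online gradient descent on a controller parametrization" idea concrete, here is a minimal sketch in a drastically simplified setting of our own choosing: a scalar, fully observed, known system with quadratic cost, and a disturbance-action controller (control as a linear function of past disturbances) updated by online gradient descent. All constants (`a`, `m`, `lr`, the noise range) are illustrative assumptions, the gradient only differentiates the instantaneous cost through the current action, and the paper's actual parametrization works from denoised partial observations; this is not the authors' algorithm, only a toy of the same flavor.

```python
import numpy as np

# Toy setup (assumed): scalar dynamics x_{t+1} = a*x_t + u_t + w_t with known a
# and bounded disturbances w_t; per-step cost x_t^2 + u_t^2.
# Controller: u_t = sum_{i=1}^m M[i-1] * w_{t-i}, parameters M updated by
# online gradient descent on the instantaneous cost.
rng = np.random.default_rng(0)
a, m, T, lr = 0.5, 5, 2000, 0.01
M = np.zeros(m)        # disturbance-action controller parameters
w_hist = np.zeros(m)   # last m disturbances, newest first
x, total_cost = 0.0, 0.0
for t in range(T):
    u = M @ w_hist                 # action from past disturbances
    w = rng.uniform(-0.5, 0.5)     # stand-in for adversarial noise
    total_cost += x**2 + u**2
    # Gradient of the instantaneous cost w.r.t. M through u only
    # (a full treatment also differentiates a truncated counterfactual state).
    M -= lr * (2 * u * w_hist)
    x = a * x + u + w              # advance the true system
    w_hist = np.roll(w_hist, 1)
    w_hist[0] = w
avg_cost = total_cost / T
```

The point of the parametrization is that the cost is convex in `M` even though it is not convex in a state-feedback gain, which is what lets online convex optimization machinery deliver regret bounds.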
