arXiv:2010.05673

Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning?

12 October 2020
Qiwen Cui
Lin F. Yang
    OffRL
Abstract

It is believed that a model-based approach for reinforcement learning (RL) is the key to reducing sample complexity. However, the understanding of the sample optimality of model-based RL is still largely missing, even in the linear case. This work considers the sample complexity of finding an $\epsilon$-optimal policy in a Markov decision process (MDP) that admits a linear additive feature representation, given only access to a generative model. We solve this problem via a plug-in solver approach, which builds an empirical model and plans in this empirical model via an arbitrary plug-in solver. We prove that, under the anchor-state assumption, which implies implicit non-negativity in the feature space, the minimax sample complexity of finding an $\epsilon$-optimal policy in a $\gamma$-discounted MDP is $O(K/((1-\gamma)^3\epsilon^2))$, which depends only on the dimensionality $K$ of the feature space and has no dependence on the state or action space. We further extend our results to a relaxed setting where anchor states may not exist and show that a plug-in approach can be sample-efficient as well, providing a flexible approach to designing model-based algorithms for RL.
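The abstract describes the plug-in solver paradigm: estimate an empirical model from the generative oracle, then hand that model to an arbitrary planner. Below is a minimal Python sketch of that paradigm in a simplified setting with $K$ anchor state-action pairs and plain value iteration as the plug-in planner; the oracle `sample_next_state`, the feature map `phi`, and all other names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def plug_in_solver(sample_next_state, rewards, phi, anchors,
                   n_states, n_actions, gamma=0.9, n_samples=1000, tol=1e-6):
    """Illustrative sketch of the plug-in solver paradigm (not the paper's code).

    sample_next_state(s, a) -> s'   generative-model oracle (assumed available)
    rewards[s, a]                   known reward function, shape (n_states, n_actions)
    phi(s, a) -> length-K array     non-negative feature coefficients over the anchors
    anchors                         list of K anchor (state, action) pairs
    """
    K = len(anchors)

    # 1. Build an empirical transition model at the K anchor pairs only.
    P_anchor = np.zeros((K, n_states))
    for k, (s, a) in enumerate(anchors):
        for _ in range(n_samples):
            P_anchor[k, sample_next_state(s, a)] += 1.0
        P_anchor[k] /= n_samples

    # 2. Extend to all (s, a) via the linear additive feature representation:
    #    P(. | s, a) = sum_k phi(s, a)[k] * P(. | anchor_k).
    P_hat = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            P_hat[s, a] = phi(s, a) @ P_anchor

    # 3. Plan in the empirical model with an arbitrary plug-in solver;
    #    here, plain value iteration.
    V = np.zeros(n_states)
    while True:
        Q = rewards + gamma * P_hat @ V          # shape (n_states, n_actions)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return Q.argmax(axis=1), V                   # greedy policy and its value estimate
```

In this sketch the empirical model is estimated only at the $K$ anchor pairs, so the number of oracle calls scales with $K$ rather than with the sizes of the state and action spaces, which matches the intuition behind the stated bound.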
