
Layered State Discovery for Incremental Autonomous Exploration

Abstract

We study the autonomous exploration (AX) problem proposed by Lim & Auer (2012). In this setting, the objective is to discover a set of $\epsilon$-optimal policies reaching a set $\mathcal{S}_L^{\rightarrow}$ of incrementally $L$-controllable states. We introduce a novel layered decomposition of the set of incrementally $L$-controllable states that is based on the iterative application of a state-expansion operator. We leverage these results to design Layered Autonomous Exploration (LAE), a novel algorithm for AX that attains a sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L(1+\epsilon)}\Gamma_{L(1+\epsilon)} A \ln^{12}(S^{\rightarrow}_{L(1+\epsilon)})/\epsilon^2)$, where $S^{\rightarrow}_{L(1+\epsilon)}$ is the number of states that are incrementally $L(1+\epsilon)$-controllable, $A$ is the number of actions, and $\Gamma_{L(1+\epsilon)}$ is the branching factor of the transitions over such states. LAE improves over the algorithm of Tarbouriech et al. (2020a) by a factor of $L^2$ and it is the first algorithm for AX that works in a countably-infinite state space. Moreover, we show that, under a certain identifiability assumption, LAE achieves minimax-optimal sample complexity of $\tilde{\mathcal{O}}(LS^{\rightarrow}_{L}A\ln^{12}(S^{\rightarrow}_{L})/\epsilon^2)$, outperforming existing algorithms and matching for the first time the lower bound proved by Cai et al. (2022) up to logarithmic factors.
