Planning for robotic exploration based on forward simulation

A robotic agent is tasked with exploring an a priori unknown environment. The objective is to maximize the amount of information gathered about the partially observable state. The problem is formulated as a partially observable Markov decision process (POMDP) with an information-theoretic objective function, which is further approximated to a form suitable for robotic exploration. An open-loop approximation combined with receding horizon control is applied to solve the problem. Algorithms that evaluate the utilities of action sequences by forward simulation are presented for both finite and uncountable action spaces. The advantages of the receding horizon approach over myopic planning are demonstrated in simulated and real-world exploration experiments. The proposed method is applicable to a wide range of domains, in both dynamic and static environments, by modifying the underlying state transition and observation models.
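As a rough illustration of the kind of planner the abstract describes, the sketch below shows receding horizon, open-loop planning over a finite action space, where each candidate action sequence is scored by forward simulation of an entropy-reduction (information gain) utility. It is a minimal sketch under assumed interfaces: the function and parameter names (`transition`, `sample_observation`, `update`, `n_samples`, etc.) are illustrative assumptions, not the authors' implementation.

```python
import itertools
import math


def entropy(belief):
    """Shannon entropy of a discrete belief (dict: state -> probability)."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)


def forward_simulate(belief, actions, transition, sample_observation, update, n_samples=20):
    """Estimate the expected information gain of an action sequence by
    sampling observation trajectories and averaging the entropy reduction."""
    gains = []
    for _ in range(n_samples):
        b, gain = dict(belief), 0.0
        for a in actions:
            b_pred = transition(b, a)                  # predicted belief after acting
            z = sample_observation(b_pred, a)          # simulated observation
            b_post = update(b_pred, a, z)              # Bayesian belief update
            gain += entropy(b_pred) - entropy(b_post)  # information gained this step
            b = b_post
        gains.append(gain)
    return sum(gains) / len(gains)


def receding_horizon_action(belief, action_space, horizon, transition, sample_observation, update):
    """Open-loop receding horizon planning: score every action sequence of
    length `horizon` by forward simulation and return the best first action;
    after executing it and observing, the caller replans from the new belief."""
    best_seq, best_utility = None, -float("inf")
    for seq in itertools.product(action_space, repeat=horizon):
        u = forward_simulate(belief, seq, transition, sample_observation, update)
        if u > best_utility:
            best_seq, best_utility = seq, u
    return best_seq[0]
```

In this reading, only the first action of the best open-loop sequence is executed before replanning, which is what distinguishes the receding horizon scheme from committing to a full plan and from purely myopic (horizon-1) selection.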