Closing the Planning-Learning Loop with Application to Autonomous Driving in a Crowd

IEEE Transactions on Robotics (TRO), 2021

11 January 2021

Abstract

Imagine an autonomous robot vehicle driving in dense, possibly unregulated urban traffic. To contend with an uncertain, interactive environment with heterogeneous traffic of cars, motorcycles, buses, ..., the robot vehicle has to plan over both short and long horizons in order to drive effectively and approach human-level performance. Planning explicitly over a long time horizon, however, incurs prohibitive computational cost and is impractical under real-time constraints. To achieve real-time performance for large-scale planning, this work introduces Learning from Tree Search for Driving (LeTS-Drive), which integrates planning and learning in a closed loop. LeTS-Drive learns a driving policy from a planner, which is based on sparsely sampled tree search. The learned policy in turn guides online planning for real-time vehicle control. These two steps are repeated to form a closed loop so that the planner and the learner inform each other and improve in synchrony. The entire system can learn on its own in a self-supervised manner, without human effort on explicit data labeling. We applied LeTS-Drive to autonomous driving in crowded urban environments in simulation. Experimental results show clearly that LeTS-Drive outperforms planning alone, learning alone, and open-loop integration of planning and learning.
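
The closed loop described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical one-dimensional corridor instead of a driving simulator, uses a small exhaustive depth-limited lookahead as a stand-in for the sparsely sampled tree search, and uses lookup tables as a stand-in for the learned policy and value. All names (`step`, `plan`, `closed_loop`, `GOAL`, `GAMMA`) are invented for illustration. The point is only the structure: the planner labels states, the learner fits those labels, and the learned values then extend the planner's effective horizon in the next round.

```python
# Toy illustration only: a corridor with states 0..GOAL and a reward at the right end.
GOAL = 6
ACTIONS = (-1, +1)   # move left or move right
GAMMA = 0.9          # discount factor (assumed value)


def step(state, action):
    """Deterministic toy dynamics; reward 1.0 on entering the goal state."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def plan(state, value, depth):
    """Short-horizon lookahead (stand-in for the tree search).

    Leaves are scored with the learned `value` table, which is how the
    learner guides the planner. Returns (best_action, best_return).
    """
    if state == GOAL:
        return None, 0.0
    if depth == 0:
        return None, value[state]
    best_action, best_q = None, float("-inf")
    for action in ACTIONS:
        nxt, reward = step(state, action)
        q = reward + GAMMA * plan(nxt, value, depth - 1)[1]
        if q > best_q:  # ties keep the first action tried (move left)
            best_action, best_q = action, q
    return best_action, best_q


def closed_loop(iterations, depth=2):
    """Alternate planning and learning; no human labels are involved."""
    value = [0.0] * (GOAL + 1)      # learned value, initially uninformed
    policy = [ACTIONS[0]] * GOAL    # learned policy for non-goal states
    for _ in range(iterations):
        # Planning step: the planner, guided by the current value table,
        # produces an action and a return estimate for every state.
        labels = {s: plan(s, value, depth) for s in range(GOAL)}
        # Learning step: fit the policy and value tables to the planner output.
        for s, (action, q) in labels.items():
            policy[s] = action
            value[s] = q
    return policy, value
```

In this sketch the depth-2 planner on its own cannot see the goal from state 0, so with an uninformed value table it defaults to moving left. After three rounds of `closed_loop`, the learned values have propagated back from the goal and the learned policy moves right in every state, which mirrors, in a very reduced form, the claim that planner and learner improve each other.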
