Learning to Walk in Every Direction
Numerous algorithms have been proposed to allow legged robots to learn to walk. However, the vast majority are designed to learn to walk along a straight line, which is not sufficient to accomplish any real-world mission. Here we introduce TBR-Learning, a new learning algorithm that simultaneously discovers several hundred simple walking controllers, one for each possible direction. By taking advantage of solutions that are usually discarded, TBR-Learning is substantially faster than learning each controller independently. Our technique relies on two methods: (1) novelty search with local competition, which comes from the artificial life research field, and (2) the transferability approach, which combines simulations and real tests to optimize a policy. We evaluate this new technique on a hexapod robot. Results show that with only a few dozen short experiments performed on the physical robot, the algorithm learns a collection of controllers that allows the robot to reach every point of its reachable space. Overall, TBR-Learning introduces a new kind of learning algorithm, one that simultaneously optimizes all the achievable behaviors of a robot.
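The idea of collecting many controllers in a single run, one per reachable point, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy two-parameter controller and a hypothetical `simulate` function in place of a robot simulator, and it swaps in a simple grid archive (keep the best controller found for each cell of the reachable space) for novelty search with local competition. The transferability step, which would correct the archive with tests on the physical robot, is omitted. All names here (`simulate`, `cell_of`, `build_repertoire`, `controller_for`) are illustrative.

```python
import math
import random


def simulate(params):
    """Hypothetical stand-in for a robot simulator (an assumption).

    Maps two controller parameters (speed, turn) to the (x, y) point
    reached and a toy quality score that prefers small turn angles.
    """
    speed, turn = params
    point = (speed * math.cos(turn), speed * math.sin(turn))
    quality = -abs(turn)
    return point, quality


def cell_of(point, cell_size=0.2):
    """Discretize a reached point into a grid cell index."""
    return (math.floor(point[0] / cell_size),
            math.floor(point[1] / cell_size))


def build_repertoire(iterations=2000, seed=0):
    """Fill a grid archive: one controller kept per reached cell."""
    rng = random.Random(seed)
    repertoire = {}  # cell -> (quality, params, point)
    for _ in range(iterations):
        if repertoire and rng.random() < 0.9:
            # Mutate a controller already stored in the archive.
            _, parent, _ = rng.choice(list(repertoire.values()))
            speed = min(1.0, max(0.0, parent[0] + rng.gauss(0, 0.1)))
            params = (speed, parent[1] + rng.gauss(0, 0.3))
        else:
            # Occasionally sample a fresh random controller.
            params = (rng.random(), rng.uniform(-math.pi, math.pi))
        point, quality = simulate(params)
        cell = cell_of(point)
        # Keep the candidate if its cell is empty or it beats the occupant,
        # so solutions that are poor at one target can still fill another.
        if cell not in repertoire or quality > repertoire[cell][0]:
            repertoire[cell] = (quality, params, point)
    return repertoire


def controller_for(repertoire, target):
    """Return the stored entry whose endpoint is closest to the target."""
    return min(repertoire.values(), key=lambda e: math.dist(e[2], target))


if __name__ == "__main__":
    rep = build_repertoire()
    quality, params, point = controller_for(rep, (0.5, 0.0))
    print(len(rep), "cells filled; nearest endpoint to (0.5, 0.0):", point)
```

Under these assumptions, a single run yields a lookup table from target points to controllers, and no candidate is thrown away merely because it misses one particular target.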