3D Simulation for Robot Arm Control with Deep Q-Learning
Intelligent control of robotic arms holds great potential, but current systems often fail to adapt when presented with new and unfamiliar environments. Recent work on this problem has shifted away from handcrafted, modular pipelines towards end-to-end solutions that use deep reinforcement learning to learn policies directly from visual input. Building upon the recent success of deep Q-networks, we present an approach that uses three-dimensional simulations to train a 7-DOF robotic arm on a control task without any prior knowledge. Policies accept images of the environment as input and output motor actions. However, the high dimensionality of the policies and the large state space make policy search difficult. We overcome this by ensuring that interesting states are explored via intermediate rewards that guide the policy towards higher-reward states. Our results demonstrate that deep Q-networks can learn policies for a task that involves locating a cube, grasping it, and then lifting it. The agent learns to handle a range of starting joint configurations and starting cube positions when tested in simulation. Moreover, we show that policies trained in simulation have the potential to be applied directly to real-world equivalents without any further training. We believe that robot simulations can decrease the dependency on physical robots and ultimately improve the productivity of training for robot control tasks.
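To make the setup concrete, below is a minimal PyTorch sketch of the two ingredients the abstract describes: an image-conditioned Q-network that maps camera observations to Q-values over discrete motor actions, and a shaped intermediate reward. The network architecture, the action discretization (two step directions per joint), and the specific reward terms are illustrative assumptions, not the paper's exact design.

```python
import random

import torch
import torch.nn as nn


class DQN(nn.Module):
    """Convolutional Q-network mapping an 84x84 RGB observation to one
    Q-value per discrete motor action (Atari-style architecture; the
    paper's exact network is an assumption here)."""

    def __init__(self, n_actions: int = 14):  # e.g. +/- one step per joint for 7 joints (assumption)
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scale raw pixel values to [0, 1] before the convolutional stack.
        return self.head(self.features(x / 255.0))


def select_action(net: DQN, obs: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy action selection for one observation of shape (3, 84, 84)."""
    if random.random() < epsilon:
        return random.randrange(net.head[-1].out_features)
    with torch.no_grad():
        return int(net(obs.unsqueeze(0)).argmax(dim=1).item())


def shaped_reward(dist_to_cube: float, grasped: bool, lift_height: float) -> float:
    """Illustrative intermediate reward (an assumption, not the paper's exact
    scheme): a distance penalty steers the gripper towards the cube, and
    bonuses reward grasping and then lifting it."""
    reward = -0.1 * dist_to_cube                 # guide exploration towards the cube
    if grasped:
        reward += 1.0                            # grasp bonus
        reward += 10.0 * max(lift_height, 0.0)   # lift bonus
    return reward
```

Training would then proceed with the standard DQN machinery (experience replay, a target network, and TD-error minimization), with the shaped reward replacing the sparse task-completion signal so that the policy is drawn through the locate, grasp, and lift stages.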