We present a learning-based method for representing grasp poses of a high-DOF hand using neural networks. Due to redundancy in such high-DOF grippers, there exists a large number of equally effective grasp poses for a given target object, making it difficult for the neural network to find consistent grasp poses. We resolve this ambiguity by generating an augmented dataset that covers many possible grasps for each target object and train our neural networks using a consistency loss function to identify a one-to-one mapping from objects to grasp poses. We further enhance the quality of neural-network-predicted grasp poses using a collision loss function to avoid penetrations. We use an object dataset that combines the BigBIRD Database, the KIT Database, the YCB Database, and the Grasp Dataset to show that our method can generate high-DOF grasp poses with higher accuracy than supervised learning baselines. The quality of the grasp poses is on par with the groundtruth poses in the dataset. In addition, our method is robust and can handle noisy object models such as those constructed from multi-view depth images, allowing our method to be implemented on a 25-DOF Shadow Hand hardware platform.
View on arXiv