While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in SE(3) remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit substantially from better grasp performance. This paper proposes a novel framework for detecting grasp poses from point cloud input. Our main contribution is an SE(3)-equivariant model that maps each point in the cloud to a continuous grasp quality function over the 2-sphere, represented in a spherical harmonic basis. Compared with reasoning about a finite set of samples, this formulation improves the accuracy and efficiency of our model in cases where a large number of samples would otherwise be needed. To accomplish this, we propose a novel variation on EquiFormerV2 that uses a UNet-style backbone to increase the number of points the model can handle. Our resulting method significantly outperforms baselines in both simulation and physical experiments.
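To make the idea of a continuous grasp quality function over the 2-sphere concrete, the following is a minimal sketch, not the model described above. It assumes a single point whose quality function is given by nine real spherical harmonic coefficients (degrees 0 to 2); the coefficient values, the degree cutoff, and the helper names `real_sph_harm_basis`, `grasp_quality`, and `fibonacci_sphere` are all hypothetical choices made for illustration. Because the function is defined by its coefficients, it can be evaluated at any direction on the sphere rather than only at a fixed set of sampled directions.

```python
import numpy as np

def real_sph_harm_basis(dirs):
    """Real spherical harmonics of degree 0..2 at unit vectors dirs (N, 3) -> (N, 9)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.5 * np.sqrt(1.0 / np.pi)           # degree 0
    c1 = np.sqrt(3.0 / (4.0 * np.pi))         # degree 1
    c2 = 0.5 * np.sqrt(15.0 / np.pi)          # degree 2, m = -2, -1, 1
    c3 = 0.25 * np.sqrt(5.0 / np.pi)          # degree 2, m = 0
    c4 = 0.25 * np.sqrt(15.0 / np.pi)         # degree 2, m = 2
    return np.stack([
        np.full_like(x, c0),
        c1 * y, c1 * z, c1 * x,
        c2 * x * y, c2 * y * z, c3 * (3.0 * z**2 - 1.0), c2 * x * z,
        c4 * (x**2 - y**2),
    ], axis=1)

def grasp_quality(coeffs, dirs):
    """Evaluate the quality function sum_k coeffs[k] * Y_k(dir) at each direction."""
    return real_sph_harm_basis(dirs) @ coeffs

def fibonacci_sphere(n):
    """Roughly uniform unit vectors on the sphere, used here only as query directions."""
    i = np.arange(n) + 0.5
    z = 1.0 - 2.0 * i / n
    phi = np.pi * (1.0 + np.sqrt(5.0)) * i
    r = np.sqrt(1.0 - z**2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

# Hypothetical coefficients for one point: only the (l=1, m=0) term is nonzero,
# so the quality function is proportional to z and peaks at the +z direction.
coeffs = np.zeros(9)
coeffs[2] = 1.0

dirs = fibonacci_sphere(2000)
quality = grasp_quality(coeffs, dirs)
best_dir = dirs[np.argmax(quality)]   # query direction with the highest quality
```

In this toy setup the query directions can be made as dense as needed without changing the coefficients, which is the practical appeal of representing the function in a basis instead of scoring a fixed list of candidates.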