Push-Grasp Policy Learning Using Equivariant Models and Grasp Score Optimization

Goal-conditioned robotic grasping in cluttered environments remains a challenging problem due to occlusions caused by surrounding objects, which prevent direct access to the target object. A promising solution to mitigate this issue is combining pushing and grasping policies, enabling active rearrangement of the scene to facilitate target retrieval. However, existing methods often overlook the rich geometric structures inherent in such tasks, thus limiting their effectiveness in complex, heavily cluttered scenarios. To address this, we propose the Equivariant Push-Grasp Network, a novel framework for joint pushing and grasping policy learning. Our contributions are twofold: (1) leveraging SE(2)-equivariance to improve both pushing and grasping performance and (2) a grasp score optimization-based training strategy that simplifies the joint learning process. Experimental results show that our method improves grasp success rates by 49% in simulation and by 35% in real-world scenarios compared to strong baselines, representing a significant advancement in push-grasp policy learning.
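The SE(2)-equivariance property mentioned above means that if the top-down observation of the scene is rotated or translated in the plane, the predicted push/grasp action maps transform accordingly. As a hedged, minimal illustration (not the paper's actual network), the sketch below shows the discrete 90-degree-rotation case: a toy per-pixel grasp scorer built from a rotation-symmetric local filter commutes with `np.rot90`, which is the defining check for equivariance under the C4 subgroup of SE(2). The function name `grasp_scores` and the 3x3 mean filter are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def grasp_scores(obs):
    # Toy stand-in for an equivariant score network: each pixel's score is
    # the mean of its 3x3 neighborhood (zero-padded). Because the averaging
    # kernel is symmetric under 90-degree rotation, the map commutes with
    # np.rot90 applied to a square input.
    padded = np.pad(obs, 1, mode="constant")
    out = np.zeros_like(obs)
    for i in range(obs.shape[0]):
        for j in range(obs.shape[1]):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

rng = np.random.default_rng(0)
obs = rng.random((8, 8))  # hypothetical top-down heightmap observation

# Equivariance check: rotating the observation first, or rotating the
# predicted score map after, must give the same result.
lhs = grasp_scores(np.rot90(obs))
rhs = np.rot90(grasp_scores(obs))
assert np.allclose(lhs, rhs)
```

A learned equivariant network enforces this same commutation constraint by construction (e.g. via steerable convolutions), so that evidence gathered for one scene orientation generalizes to all planar rotations of it.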
@article{hu2025_2504.03053,
  title={Push-Grasp Policy Learning Using Equivariant Models and Grasp Score Optimization},
  author={Boce Hu and Heng Tian and Dian Wang and Haojie Huang and Xupeng Zhu and Robin Walters and Robert Platt},
  journal={arXiv preprint arXiv:2504.03053},
  year={2025}
}