
Merging and Disentangling Views in Visual Reinforcement Learning for Robotic Manipulation

Main: 9 Pages
16 Figures
Bibliography: 5 Pages
1 Table
Appendix: 9 Pages
Abstract

Vision is widely used in robotic manipulation, especially through visual servoing. Because the world is three-dimensional, merging multiple camera views creates better representations for Q-learning and, in turn, trains more sample-efficient policies. Nevertheless, these multi-view policies are sensitive to failing cameras and can be burdensome to deploy. To mitigate these issues, we introduce a Merge And Disentanglement (MAD) algorithm that efficiently merges views to increase sample efficiency while simultaneously disentangling views by augmenting multi-view feature inputs with single-view features. This produces robust policies and allows lightweight deployment. We demonstrate the efficiency and robustness of our approach using Meta-World and ManiSkill3. For the project website and code, see this https URL.
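The merge-and-augment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a shared encoder applied to each camera view, summation as the merge operator, and augmentation by stacking the merged features with each single-view feature along the batch axis. The names `encode`, `merge_and_disentangle`, and `augmented_inputs` are hypothetical, and the linear-plus-tanh encoder is only a stand-in for a learned image encoder.

```python
import numpy as np

def encode(view, weights):
    # Hypothetical stand-in for a shared image encoder:
    # flatten each image, apply one linear map, then tanh.
    flat = view.reshape(view.shape[0], -1)
    return np.tanh(flat @ weights)

def merge_and_disentangle(views, weights):
    # views: list of arrays, each of shape (batch, H, W), one per camera.
    # Returns the merged multi-view feature and the list of single-view features.
    singles = [encode(v, weights) for v in views]
    # Assumption: views are merged by element-wise summation of their features.
    merged = np.sum(singles, axis=0)
    return merged, singles

def augmented_inputs(merged, singles):
    # Assumption: the downstream actor/critic sees the merged feature and every
    # single-view feature as separate samples, stacked along the batch axis.
    return np.concatenate([merged] + singles, axis=0)
```

Under these assumptions, the downstream networks receive inputs with the same feature dimension whether one camera or all cameras contribute, which is what would let a policy trained this way run on a single view at deployment time.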
