EgoSpot:Egocentric Multimodal Control for Hands-Free Mobile Manipulation
We propose a novel hands-free control framework for the Boston Dynamics Spot robot using the Microsoft HoloLens 2 mixed-reality headset. Enabling accessible robot control is critical for allowing individuals with physical disabilities to benefit from robotic assistance in daily activities, teleoperation, and remote interaction tasks. However, most existing robot control interfaces rely on manual input devices such as joysticks or handheld controllers, which can be difficult or impossible for users with limited motor capabilities. To address this limitation, we develop an intuitive multimodal control system that leverages egocentric sensing from a wearable device. Our system integrates multiple control signals, including eye gaze, head gestures, and voice commands, to enable hands-free interaction. These signals are fused to support real-time control of both robot locomotion and arm manipulation. Experimental results show that our approach achieves performance comparable to traditional joystick-based control in terms of task completion time and user experience, while significantly improving accessibility and naturalness of interaction. Our results highlight the potential of egocentric multimodal interfaces to make mobile manipulation robots more inclusive and usable for a broader population. A demonstration of the system is available on our project webpage.
View on arXiv