Title
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models Chen Wang Fei Xia Wenhao Yu Tingnan Zhang Ruohan Zhang Ce Liu Li Fei-Fei Jie Tan Jacky Liang 31 0 0 17 Apr 2025
MotionGlot: A Multi-Embodied Motion Generation Model Sudarshan Harithas Srinath Sridhar 68 1 0 22 Oct 2024
SPINE: Online Semantic Planning for Missions with Incomplete Natural Language Specifications in Unstructured Environments Zachary Ravichandran Varun Murali Mariliza Tzes George J. Pappas Vijay Kumar LRM 51 6 0 03 Oct 2024
KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems Zixuan Wang Bo Yu Junzhe Zhao Wenhao Sun Sai Hou Shuai Liang Xing Hu Yinhe Han Yiming Gan 37 1 0 23 Sep 2024
Open-vocabulary Queryable Scene Representations for Real World Planning Boyuan Chen F. Xia Brian Ichter Kanishka Rao K. Gopalakrishnan Michael S. Ryoo Austin Stone Daniel Kappler LM&Ro 144 179 0 20 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action Dhruv Shah B. Osinski Brian Ichter Sergey Levine LM&Ro 136 430 0 10 Jul 2022
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings Arjun Majumdar Gunjan Aggarwal Bhavika Devnani Judy Hoffman Dhruv Batra LM&Ro 144 148 0 24 Jun 2022