Episodic Transformer for Vision-and-Language Navigation

IEEE International Conference on Computer Vision (ICCV), 2021

13 May 2021

Papers citing "Episodic Transformer for Vision-and-Language Navigation"

50 / 140 papers shown

End-to-End (Instance)-Image Goal Navigation through Correspondence as an Emergent Phenomenon. International Conference on Learning Representations (ICLR), 2023 G. Bono L. Antsfeld Boris Chidlovskii Zhi Zheng Christian Wolf 3DV 140 17 0 28 Sep 2023
Hierarchical Imitation Learning for Stochastic Environments. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023 Maximilian Igl Punit Shah Paul Mougin S. Srinivasan Tarun Gupta Brandyn White K. Shiarlis Shimon Whiteson OOD 157 3 0 25 Sep 2023
Discuss Before Moving: Visual Language Navigation via Multi-expert Discussions. IEEE International Conference on Robotics and Automation (ICRA), 2023 Yuxing Long Xiaoqi Li Wenzhe Cai Hao Dong LLMAG LM&Ro 343 102 0 20 Sep 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions. Conference on Robot Learning (CoRL), 2023 Yevgen Chebotar Q. Vuong A. Irpan Karol Hausman F. Xia ... Brianna Zitkovich Tomas Jackson Kanishka Rao Chelsea Finn Sergey Levine OffRL 321 123 0 18 Sep 2023
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation. Neural Information Processing Systems (NeurIPS), 2023 Hongchen Wang Andy Guan Hong Chen Xiaoqi Li Mingdong Wu Hao Dong 347 24 0 15 Sep 2023
Grounded Entity-Landmark Adaptive Pre-training for Vision-and-Language Navigation. IEEE International Conference on Computer Vision (ICCV), 2023 Yibo Cui Liang Xie Yakun Zhang Meishan Zhang Ye Yan Erwei Yin LM&Ro 187 26 0 24 Aug 2023
Target-Grounded Graph-Aware Transformer for Aerial Vision-and-Dialog Navigation Yi-Chiao Su Dongyan An Yuan Xu Kehan Chen Yan Huang 252 4 0 22 Aug 2023
Multi-Level Compositional Reasoning for Interactive Instruction Following. AAAI Conference on Artificial Intelligence (AAAI), 2023 Suvaansh Bhambri Byeonghwi Kim Jonghyun Choi LM&Ro 214 14 0 18 Aug 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation. IEEE International Conference on Computer Vision (ICCV), 2023 Hanqing Wang Wei Liang Luc Van Gool Wenguan Wang LM&Ro 196 75 0 14 Aug 2023
Context-Aware Planning and Environment-Aware Memory for Instruction Following Embodied Agents. IEEE International Conference on Computer Vision (ICCV), 2023 Byeonghwi Kim Jinyeon Kim Yuyeong Kim Cheol-Hui Min Jonghyun Choi LM&Ro 417 37 0 14 Aug 2023
Object Goal Navigation with Recursive Implicit Maps. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023 Shizhe Chen Thomas Chabal Ivan Laptev Cordelia Schmid 186 31 0 10 Aug 2023
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI Hangjie Shi Leslie Ball Govind Thattai Desheng Zhang Lu Hu ... Michael Johnston Akshaya Iyengar Arindam Mandal Premkumar Natarajan R. Ghanadan 112 6 0 09 Aug 2023
Bird's-Eye-View Scene Graph for Vision-Language Navigation. IEEE International Conference on Computer Vision (ICCV), 2023 Ruitao Liu Xiaohan Wang Wenguan Wang Yi Yang 292 81 0 09 Aug 2023
LEMMA: Learning Language-Conditioned Multi-Robot Manipulation. IEEE Robotics and Automation Letters (RA-L), 2023 Ran Gong Xiaofeng Gao Qiaozi Gao Suhaila Shakiah Govind Thattai Gaurav Sukhatme LM&Ro 229 13 0 02 Aug 2023
MAEA: Multimodal Attribution for Embodied AI Vidhi Jain Jayant Sravan Tamarapalli Sahiti Yerramilli Yonatan Bisk 179 9 0 25 Jul 2023
GridMM: Grid Memory Map for Vision-and-Language Navigation. IEEE International Conference on Computer Vision (ICCV), 2023 Zihan Wang Xiangyang Li Jiahao Yang Yeqi Liu Shuqiang Jiang 355 99 0 24 Jul 2023
Learning Navigational Visual Representations with Semantic Map Supervision. IEEE International Conference on Computer Vision (ICCV), 2023 Yicong Hong Yang Zhou Ruiyi Zhang Franck Dernoncourt Trung Bui Stephen Gould Hao Tan SSL 199 42 0 23 Jul 2023
Breaking Down the Task: A Unit-Grained Hybrid Training Framework for Vision and Language Decision Making Ruipu Luo Jiwen Zhang Zhongyu Wei VLM 177 0 0 16 Jul 2023
Goal-Conditioned Predictive Coding for Offline Reinforcement Learning. Neural Information Processing Systems (NeurIPS), 2023 Zilai Zeng Ce Zhang Shijie Wang Chen Sun OffRL 185 12 0 07 Jul 2023
Improving Long-Horizon Imitation Through Instruction Prediction. AAAI Conference on Artificial Intelligence (AAAI), 2023 Joey Hejna Pieter Abbeel Lerrel Pinto 174 12 0 21 Jun 2023
SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling. IEEE International Conference on Robotics and Automation (ICRA), 2023 Jesse Zhang Karl Pertsch Jiahui Zhang Joseph J. Lim LM&Ro 415 22 0 20 Jun 2023
CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments. AAAI Conference on Artificial Intelligence (AAAI), 2023 Xiulong Liu Sudipta Paul Moitreya Chatterjee A. Cherian 187 11 0 06 Jun 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models. AAAI Conference on Artificial Intelligence (AAAI), 2023 Gengze Zhou Yicong Hong Qi Wu ELM LM&Ro LLMAG LRM 433 260 0 26 May 2023
R2H: Building Multimodal Navigation Helpers that Respond to Help Requests. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Yue Fan Jing Gu Kaizhi Zheng Xin Eric Wang 227 5 0 23 May 2023
Yes, this Way! Learning to Ground Referring Expressions into Actions with Intra-episodic Feedback from Supportive Teachers. Annual Meeting of the Association for Computational Linguistics (ACL), 2023 P. Sadler Sherzod Hakimov David Schlangen 263 3 0 22 May 2023
Learning to Reason over Scene Graphs: A Case Study of Finetuning GPT-2 into a Robot Language Model for Grounded Task Planning Georgia Chalvatzaki A. Younes Daljeet Nandha An T. Le Leonardo F. R. Ribeiro Iryna Gurevych LM&Ro LRM LLMAG 140 39 0 12 May 2023
Multimodal Contextualized Plan Prediction for Embodied Task Completion Mert Inan Aishwarya Padmakumar Spandana Gella P. Lange Dilek Z. Hakkani-Tür LM&Ro 177 0 0 10 May 2023
Pretrained Language Models as Visual Planners for Human Assistance. IEEE International Conference on Computer Vision (ICCV), 2023 Dhruvesh Patel H. Eghbalzadeh Nitin Kamra Michael L. Iuzzolino Unnat Jain Ruta Desai LM&Ro 253 34 0 17 Apr 2023
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes. IEEE International Conference on Computer Vision (ICCV), 2023 Ran Gong Jiangyong Huang Yizhou Zhao Haoran Geng Xiaofeng Gao ... Ziheng Zhou D. Terzopoulos Song-Chun Zhu Baoxiong Jia Siyuan Huang LM&Ro 273 67 0 09 Apr 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following. Conference on Robot Learning (CoRL), 2023 Mingyu Ding Yan Xu Zhenfang Chen David D. Cox Ping Luo J. Tenenbaum Chuang Gan LM&Ro 179 24 0 07 Apr 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation. Computer Vision and Pattern Recognition (CVPR), 2023 Xiaohan Wang Wenguan Wang Jiayi Shao Yi Yang LLMAG LM&Ro 225 55 0 15 Mar 2023
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding. Computer Vision and Pattern Recognition (CVPR), 2023 Minyoung Hwang Jaeyeon Jeong Minsoo Kim Yoonseon Oh Songhwai Oh 220 26 0 07 Mar 2023
Alexa Arena: A User-Centric Interactive Platform for Embodied AI. Neural Information Processing Systems (NeurIPS), 2023 Qiaozi Gao Govind Thattai Suhaila Shakiah Xiaofeng Gao Shreyas Pansare ... Michael Johnston R. Ghanadan Arindam Mandal Dilek Z. Hakkani-Tür Premkumar Natarajan 150 31 0 02 Mar 2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents. Neural Information Processing Systems (NeurIPS), 2023 Wenlong Huang Fei Xia Dhruv Shah Danny Driess Andy Zeng ... Pete Florence Igor Mordatch Sergey Levine Karol Hausman Brian Ichter LM&Ro 216 74 0 01 Mar 2023
Multimodal Speech Recognition for Language-Guided Embodied Agents. Interspeech (Interspeech), 2023 Allen Chang Xiaoyuan Zhu Aarav Monga Seoho Ahn Tejas Srinivasan Jesse Thomason AuLLM 325 6 0 27 Feb 2023
Learning by Asking for Embodied Visual Navigation and Task Completion. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023 Ying Shen Ismini Lourentzou 251 1 0 09 Feb 2023
Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals. Neural Information Processing Systems (NeurIPS), 2023 Yue Wu Yewen Fan Paul Pu Liang A. Azaria Yuan-Fang Li Tom Michael Mitchell OffRL 249 54 0 09 Feb 2023
On Transforming Reinforcement Learning by Transformer: The Development Trajectory. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022 Shengchao Hu Li Shen Ya Zhang Yixin Chen Dacheng Tao OffRL 340 58 0 29 Dec 2022
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments. Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Yu Gu Xiang Deng Yu-Chuan Su LLMAG 387 71 0 19 Dec 2022
RT-1: Robotics Transformer for Real-World Control at Scale Anthony Brohan Noah Brown Justice Carbajal Yevgen Chebotar Joseph Dabis ... Ted Xiao Peng Xu Sichun Xu Tianhe Yu Brianna Zitkovich LM&Ro 447 1,674 0 13 Dec 2022
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models. IEEE International Conference on Computer Vision (ICCV), 2022 Chan Hee Song Jiaman Wu Clay Washington Brian M Sadler Wei-Lun Chao Yu-Chuan Su LLMAG LM&Ro 462 593 0 08 Dec 2022
Layout-aware Dreamer for Embodied Referring Expression Grounding Mingxiao Li Zehao Wang Tinne Tuytelaars Marie-Francine Moens LM&Ro 140 7 0 30 Nov 2022
Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following Y. Inoue Hiroki Ohashi LM&Ro 150 50 0 07 Nov 2022
Towards Versatile Embodied Navigation. Neural Information Processing Systems (NeurIPS), 2022 Hongru Wang Wei Liang Luc Van Gool Wenguan Wang LM&Ro 185 29 0 30 Oct 2022
DOROTHIE: Spoken Dialogue for Handling Unexpected Situations in Interactive Autonomous Driving Agents. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Ziqiao Ma B. VanDerPloeg Cristian-Paul Bara Yidong Huang Eui-In Kim Felix Gervits M. Marge J. Chai 273 10 0 22 Oct 2022
DANLI: Deliberative Agent for Following Natural Language Instructions. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Yichi Zhang Jianing Yang Jiayi Pan Shane Storks N. Devraj Ziqiao Ma Keunwoo Peter Yu Yuwei Bao J. Chai LM&Ro 316 20 0 22 Oct 2022
SQA3D: Situated Question Answering in 3D Scenes. International Conference on Learning Representations (ICLR), 2022 Xiaojian Ma Silong Yong Zilong Zheng Qing Li Yitao Liang Song-Chun Zhu Siyuan Huang LM&Ro 430 238 0 14 Oct 2022
Retrospectives on the Embodied AI Workshop Matt Deitke Dhruv Batra Yonatan Bisk Tommaso Campari Angel X. Chang ... Jesse Thomason Alexander Toshev Joanne Truong Luca Weihs Jiajun Wu LM&Ro 333 53 0 13 Oct 2022
Multi-Object Navigation with dynamically learned neural implicit representations. IEEE International Conference on Computer Vision (ICCV), 2022 Pierre Marza L. Matignon Olivier Simonin Christian Wolf 224 29 0 11 Oct 2022
Generating Executable Action Plans with Environmentally-Aware Language Models. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022 Maitrey Gramopadhye D. Szafir LM&Ro LLMAG 265 37 0 10 Oct 2022