arXiv: 1911.07883
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks

18 November 2019
Fengda Zhu
Yi Zhu
Xiaojun Chang
Xiaodan Liang
    LRM

Papers citing "Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks"

44 / 44 papers shown
Title
DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation
Yinfeng Yu
Dongsheng Yang
22
0
0
30 Apr 2025
Think Hierarchically, Act Dynamically: Hierarchical Multi-modal Fusion and Reasoning for Vision-and-Language Navigation
Junrong Yue
Y. Zhang
Chuan Qin
Bo Li
Xiaomin Lie
Xinlei Yu
Wenxin Zhang
Zhendong Zhao
43
0
0
23 Apr 2025
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu
Yaojie Shen
Chenxi Luo
Tiejian Luo
Yan Huang
Yuewei Lin
Heng Fan
L. Zhang
55
1
0
16 Feb 2025
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models
Yue Zhang
Ziqiao Ma
Jialu Li
Yanyuan Qiao
Zun Wang
J. Chai
Qi Wu
Mohit Bansal
Parisa Kordjamshidi
LRM
51
18
0
31 Dec 2024
iWalker: Imperative Visual Planning for Walking Humanoid Robot
Xiao Lin
Yuhao Huang
Taimeng Fu
Xiaobin Xiong
Chen Wang
34
0
0
27 Sep 2024
Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs
Yanyuan Qiao
Wenqi Lyu
Hui Wang
Zixu Wang
Zerui Li
Yuan Zhang
Mingkui Tan
Qi Wu
LRM
36
2
0
27 Sep 2024
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
Bingqian Lin
Yunshuang Nie
Ziming Wei
Jiaqi Chen
Shikui Ma
Jianhua Han
Hang Xu
Xiaojun Chang
Xiaodan Liang
LM&Ro
LRM
60
19
0
12 Mar 2024
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
Bingqian Lin
Yanxin Long
Yi Zhu
Fengda Zhu
Xiaodan Liang
QiXiang Ye
Liang Lin
27
5
0
09 Mar 2024
Continual Referring Expression Comprehension via Dual Modular Memorization
Hengtao Shen
Cheng Chen
Peng Wang
Lianli Gao
M. Wang
Jingkuan Song
ObjD
25
3
0
25 Nov 2023
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Hanqing Wang
Wei Liang
Luc Van Gool
Wenguan Wang
LM&Ro
17
28
0
14 Aug 2023
GeoVLN: Learning Geometry-Enhanced Visual Representation with Slot Attention for Vision-and-Language Navigation
Jingyang Huo
Qiang Sun
Boyan Jiang
Haitao Lin
Yanwei Fu
27
18
0
26 May 2023
NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models
Gengze Zhou
Yicong Hong
Qi Wu
ELM
LM&Ro
LLMAG
LRM
23
139
0
26 May 2023
Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
Mingyu Ding
Yan Xu
Zhenfang Chen
David D. Cox
Ping Luo
J. Tenenbaum
Chuang Gan
LM&Ro
49
21
0
07 Apr 2023
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
Xiangyang Li
Zihan Wang
Jiahao Yang
Yaowei Wang
Shuqiang Jiang
LM&Ro
13
35
0
28 Mar 2023
Meta-Explore: Exploratory Hierarchical Vision-and-Language Navigation Using Scene Object Spectrum Grounding
Minyoung Hwang
Jaeyeon Jeong
Minsoo Kim
Yoonseon Oh
Songhwai Oh
17
19
0
07 Mar 2023
MLANet: Multi-Level Attention Network with Sub-instruction for Continuous Vision-and-Language Navigation
Zongtao He
Liuyi Wang
Shu Li
Qingqing Yan
Chengju Liu
Qi Chen
10
7
0
02 Mar 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
29
3
0
13 Feb 2023
Diffusion-based Generation, Optimization, and Planning in 3D Scenes
Siyuan Huang
Zan Wang
Puhao Li
Baoxiong Jia
Tengyu Liu
Yixin Zhu
Wei Liang
Song-Chun Zhu
DiffM
39
199
0
15 Jan 2023
Graph based Environment Representation for Vision-and-Language Navigation in Continuous Environments
Ting Wang
Zongkai Wu
Feiyu Yao
Donglin Wang
35
5
0
11 Jan 2023
Bridging the visual gap in VLN via semantically richer instructions
Joaquín Ossandón
Benjamín Earle
Alvaro Soto
17
0
0
27 Oct 2022
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
Peihao Chen
Dongyu Ji
Kun-Li Channing Lin
Runhao Zeng
Thomas H. Li
Mingkui Tan
Chuang Gan
SSL
20
61
0
14 Oct 2022
Iterative Vision-and-Language Navigation
Jacob Krantz
Shurjo Banerjee
Wang Zhu
Jason J. Corso
Peter Anderson
Stefan Lee
Jesse Thomason
LM&Ro
40
18
0
06 Oct 2022
Target-Driven Structured Transformer Planner for Vision-Language Navigation
Yusheng Zhao
Jinyu Chen
Chen Gao
Wenguan Wang
Lirong Yang
Haibing Ren
Huaxia Xia
Si Liu
LM&Ro
19
56
0
19 Jul 2022
FOAM: A Follower-aware Speaker Model For Vision-and-Language Navigation
Zi-Yi Dou
Nanyun Peng
17
22
0
09 Jun 2022
Multi-View Transformer for 3D Visual Grounding
Shijia Huang
Yilun Chen
Jiaya Jia
Liwei Wang
17
112
0
05 Apr 2022
EnvEdit: Environment Editing for Vision-and-Language Navigation
Jialu Li
Hao Tan
Mohit Bansal
27
79
0
29 Mar 2022
Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration
Xiwen Liang
Fengda Zhu
Lingling Li
Hang Xu
Xiaodan Liang
LM&Ro
VLM
22
29
0
08 Mar 2022
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
LM&Ro
28
137
0
23 Feb 2022
VLP: A Survey on Vision-Language Pre-training
Feilong Chen
Duzhen Zhang
Minglun Han
Xiuyi Chen
Jing Shi
Shuang Xu
Bo Xu
VLM
82
211
0
18 Feb 2022
Curriculum Learning for Vision-and-Language Navigation
Jiwen Zhang
Zhongyu Wei
Jianqing Fan
J. Peng
LM&Ro
21
20
0
14 Nov 2021
SOAT: A Scene- and Object-Aware Transformer for Vision-and-Language Navigation
A. Moudgil
Arjun Majumdar
Harsh Agrawal
Stefan Lee
Dhruv Batra
LM&Ro
19
56
0
27 Oct 2021
ReaSCAN: Compositional Reasoning in Language Grounding
Zhengxuan Wu
Elisa Kreiss
Desmond C. Ong
Christopher Potts
CoGe
LRM
21
22
0
18 Sep 2021
Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language
Shuyan Zhou
Pengcheng Yin
Graham Neubig
LM&Ro
11
1
0
16 Sep 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
188
403
0
13 Jul 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
39
45
0
26 Jun 2021
Vision-Language Navigation with Random Environmental Mixup
Chong Liu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
Zongyuan Ge
Yi-Dong Shen
LM&Ro
48
85
0
15 Jun 2021
Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation
Muhammad Zubair Irshad
Chih-Yao Ma
Z. Kira
LM&Ro
16
49
0
21 Apr 2021
Diagnosing Vision-and-Language Navigation: What Really Matters
Wanrong Zhu
Yuankai Qi
P. Narayana
Kazoo Sone
Sugato Basu
X. Wang
Qi Wu
M. Eckstein
W. Wang
LM&Ro
22
50
0
30 Mar 2021
UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers
Siyi Hu
Fengda Zhu
Xiaojun Chang
Xiaodan Liang
OffRL
10
71
0
20 Jan 2021
Are We There Yet? Learning to Localize in Embodied Instruction Following
Shane Storks
Qiaozi Gao
Govind Thattai
Gökhan Tür
LM&Ro
37
11
0
09 Jan 2021
Language and Visual Entity Relationship Graph for Agent Navigation
Yicong Hong
Cristian Rodriguez-Opazo
Yuankai Qi
Qi Wu
Stephen Gould
LM&Ro
171
131
0
19 Oct 2020
Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation
Zhiwei Deng
Karthik Narasimhan
Olga Russakovsky
11
85
0
11 Jul 2020
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
246
495
0
07 Jun 2018
Conditional Image Synthesis With Auxiliary Classifier GANs
Augustus Odena
C. Olah
Jonathon Shlens
GAN
224
3,185
0
30 Oct 2016