arXiv:2011.10972
Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning
22 November 2020
Weixia Zhang
Chao Ma
Qi Wu
Xiaokang Yang
Papers citing "Language-guided Navigation via Cross-Modal Grounding and Alternate Adversarial Learning"
21 / 21 papers shown
Title
Temporal-Enhanced Multimodal Transformer for Referring Multi-Object Tracking and Segmentation
Changcheng Xiao
Qiong Cao
Yujie Zhong
Xiang Zhang
Tao Wang
Canqun Yang
L. Lan
23
0
0
17 Oct 2024
Cognition Transferring and Decoupling for Text-supervised Egocentric Semantic Segmentation
Zhaofeng Shi
Heqian Qiu
Lanxiao Wang
Fanman Meng
Q. Wu
Hongliang Li
26
2
0
02 Oct 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Y. Liu
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
51
47
0
09 Jul 2024
EAVL: Explicitly Align Vision and Language for Referring Image Segmentation
Yimin Yan
Xingjian He
Wenxuan Wang
Sihan Chen
J. Liu
ObjD
VLM
26
2
0
18 Aug 2023
PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation
Liuyi Wang
Chengju Liu
Zongtao He
Shu Li
Qingqing Yan
Huiyi Chen
Qi Chen
21
9
0
19 May 2023
Fully and Weakly Supervised Referring Expression Segmentation with End-to-End Learning
Hui Li
Mingjie Sun
Jimin Xiao
Eng Gee Lim
Yao-Min Zhao
29
19
0
17 Dec 2022
Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models
Cheng Ma
Yang Liu
Jiankang Deng
Lingxi Xie
Weiming Dong
Changsheng Xu
VLM
VPVLM
26
43
0
04 Nov 2022
Unsupervised Visual Odometry and Action Integration for PointGoal Navigation in Indoor Environment
Yijun Cao
Xian-Shi Zhang
Fuya Luo
Chuan Lin
Yongjie Li
22
7
0
02 Oct 2022
Monocular Camera-based Complex Obstacle Avoidance via Efficient Deep Reinforcement Learning
Jianchuan Ding
Lingping Gao
Wenxi Liu
Haiyin Piao
Jia-Yu Pan
Z. Du
Xin Yang
Baocai Yin
9
12
0
01 Sep 2022
PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding
Zihan Ding
Zixiang Ding
Tianrui Hui
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
Si Liu
12
12
0
11 Aug 2022
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions
Jing Gu
Eliana Stefani
Qi Wu
Jesse Thomason
X. Wang
LM&Ro
30
103
0
22 Mar 2022
Self-Training Vision Language BERTs with a Unified Conditional Model
Xiaofeng Yang
Fengmao Lv
Fayao Liu
Guosheng Lin
SSL
VLM
46
13
0
06 Jan 2022
MDFM: Multi-Decision Fusing Model for Few-Shot Learning
Shuai Shao
Lei Xing
Rui Xu
Weifeng Liu
Yanjiang Wang
Baodi Liu
33
30
0
01 Dec 2021
Agent-Centric Relation Graph for Object Visual Navigation
X. Hu
Youfang Lin
Shuo Wang
Zhihao Wu
Kai Lv
31
18
0
29 Nov 2021
Vision-Language Navigation: A Survey and Taxonomy
Wansen Wu
Tao Chang
Xinmeng Li
LM&Ro
13
19
0
26 Aug 2021
CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation
A. Magassouba
K. Sugiura
Hisashi Kawai
51
10
0
01 Mar 2021
Meta-Generating Deep Attentive Metric for Few-shot Classification
Lei Zhang
Fei Zhou
Wei Wei
Yanning Zhang
VLM
34
28
0
03 Dec 2020
Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Federico Landi
Lorenzo Baraldi
Marcella Cornia
M. Corsini
Rita Cucchiara
LM&Ro
10
27
0
27 Nov 2019
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
248
496
0
07 Jun 2018
VITAL: VIsual Tracking via Adversarial Learning
Yibing Song
Chao Ma
Xiaohe Wu
Lijun Gong
Linchao Bao
W. Zuo
Chunhua Shen
Rynson W. H. Lau
Ming-Hsuan Yang
GAN
AAML
60
501
0
12 Apr 2018
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
216
7,923
0
17 Aug 2015