Core Challenges in Embodied Vision-Language Planning

26 June 2021
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
    LM&Ro

Papers citing "Core Challenges in Embodied Vision-Language Planning"

42 / 42 papers shown
Online Reasoning Video Segmentation with Just-in-Time Digital Twins
Yiqing Shen
Bohan Liu
Chenjia Li
Lalithkumar Seenivasan
Mathias Unberath
VOS
69
2
0
27 Mar 2025
Synergistic Dual Spatial-aware Generation of Image-to-Text and Text-to-Image
Yu Zhao
Hao Fei
Xiangtai Li
L. Qin
Jiayi Ji
Hongyuan Zhu
Meishan Zhang
M. Zhang
Jianguo Wei
DiffM
26
1
0
20 Oct 2024
EmbodiedCity: A Benchmark Platform for Embodied Agent in Real-world City Environment
Chen Gao
Baining Zhao
Weichen Zhang
Jinzhu Mao
Jun Zhang
...
Jianjie Fang
Zile Zhou
Jinqiang Cui
X. Chen
Yong Li
LM&Ro
25
10
0
12 Oct 2024
Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models
Yew Ken Chia
Qi Sun
Lidong Bing
Soujanya Poria
LM&Ro
19
1
0
22 Sep 2024
Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making
Siyu Wu
A. Oltramari
Jonathan M Francis
C. L. Giles
Frank E. Ritter
38
0
0
17 Aug 2024
RoPotter: Toward Robotic Pottery and Deformable Object Manipulation with Structural Priors
Uksang Yoo
Adam Hung
Jonathan M Francis
Jean Oh
Jeffrey Ichnowski
16
0
0
05 Aug 2024
NavHint: Vision and Language Navigation Agent with a Hint Generator
Yue Zhang
Quan Guo
Parisa Kordjamshidi
LLMAG
10
9
0
04 Feb 2024
LLM-SAP: Large Language Models Situational Awareness Based Planning
Liman Wang
Hanyang Zhong
LLMAG
10
2
0
26 Dec 2023
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
Yafei Hu
Quanting Xie
Vidhi Jain
Jonathan M Francis
Jay Patrikar
...
Xiaolong Wang
Sebastian A. Scherer
Z. Kira
Fei Xia
Yonatan Bisk
LM&Ro
AI4CE
21
54
0
14 Dec 2023
SafeShift: Safety-Informed Distribution Shifts for Robust Trajectory Prediction in Autonomous Driving
Ben Stoler
Ingrid Navarro
Meghdeep Jana
Soonmin Hwang
Jonathan M Francis
Jean Oh
8
8
0
16 Sep 2023
MOSAIC: Learning Unified Multi-Sensory Object Property Representations for Robot Learning via Interactive Perception
Gyan Tatiya
Jonathan M Francis
Ho-Hsiang Wu
Yonatan Bisk
Jivko Sinapov
16
1
0
15 Sep 2023
Planning with Logical Graph-based Language Model for Instruction Generation
Fan Zhang
Kebing Jin
H. Zhuo
LRM
14
2
0
26 Aug 2023
What Went Wrong? Closing the Sim-to-Real Gap via Differentiable Causal Discovery
Peide Huang
Xilun Zhang
Ziang Cao
Shiqi Liu
Mengdi Xu
Wenhao Ding
Jonathan M Francis
Bingqing Chen
Ding Zhao
26
15
0
28 Jun 2023
Knowledge-enhanced Agents for Interactive Text Games
P. Chhikara
Jiarui Zhang
Filip Ilievski
Jonathan M Francis
Kaixin Ma
LLMAG
21
8
0
08 May 2023
Cross-Tool and Cross-Behavior Perceptual Knowledge Transfer for Grounded Object Recognition
Gyan Tatiya
Jonathan M Francis
Jivko Sinapov
15
4
0
07 Mar 2023
VLN-Trans: Translator for the Vision and Language Navigation Agent
Yue Zhang
Parisa Kordjamshidi
14
16
0
18 Feb 2023
Learning by Asking for Embodied Visual Navigation and Task Completion
Ying Shen
Ismini Lourentzou
10
2
0
09 Feb 2023
Knowledge-driven Scene Priors for Semantic Audio-Visual Embodied Navigation
Gyan Tatiya
Jonathan M Francis
Luca Bondi
Ingrid Navarro
Eric Nyberg
Jivko Sinapov
Jean Oh
9
8
0
21 Dec 2022
Distribution-aware Goal Prediction and Conformant Model-based Planning for Safe Autonomous Driving
Jonathan M Francis
Bingqing Chen
Weiran Yao
Eric Nyberg
Jean Oh
OOD
14
5
0
16 Dec 2022
Automaton-Based Representations of Task Knowledge from Generative Language Models
Yunhao Yang
Jean-Raphael Gaglione
Cyrus Neary
Ufuk Topcu
11
11
0
04 Dec 2022
ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
Terry Yue Zhuo
Yaqing Liao
Yuecheng Lei
Lizhen Qu
Gerard de Melo
Xiaojun Chang
Yazhou Ren
Zenglin Xu
19
2
0
11 Oct 2022
Transferring Implicit Knowledge of Non-Visual Object Properties Across Heterogeneous Robot Morphologies
Gyan Tatiya
Jonathan M Francis
Jivko Sinapov
18
13
0
14 Sep 2022
A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search
Brandon Trabucco
Gunnar A. Sigurdsson
Robinson Piramuthu
Gaurav Sukhatme
Ruslan Salakhutdinov
OCL
15
7
0
21 Jun 2022
Learn-to-Race Challenge 2022: Benchmarking Safe Learning and Cross-domain Generalisation in Autonomous Racing
Jonathan M Francis
Bingqing Chen
Siddha Ganju
Sidharth Kathpal
Jyotish Poonganam
...
Ivan Zhukov
Max Kumskoy
Anirudh Koul
Jean Oh
Eric Nyberg
6
11
0
05 May 2022
Generalizable Neuro-symbolic Systems for Commonsense Question Answering
A. Oltramari
Jonathan M Francis
Filip Ilievski
Kaixin Ma
Roshanak Mirzaee
NAI
11
8
0
17 Jan 2022
Safe Autonomous Racing via Approximate Reachability on Ego-vision
Bingqing Chen
Jonathan M Francis
Jean Oh
Eric Nyberg
Sylvia L. Herbert
26
14
0
14 Oct 2021
Waypoint Models for Instruction-guided Navigation in Continuous Environments
Jacob Krantz
Aaron Gokaslan
Dhruv Batra
Stefan Lee
Oleksandr Maksymets
LM&Ro
120
76
0
05 Oct 2021
Skill Induction and Planning with Latent Language
Pratyusha Sharma
Antonio Torralba
Jacob Andreas
LM&Ro
178
108
0
04 Oct 2021
TEACh: Task-driven Embodied Agents that Chat
Aishwarya Padmakumar
Jesse Thomason
Ayush Shrivastava
P. Lange
Anjali Narayan-Chen
Spandana Gella
Robinson Piramuthu
Gökhan Tür
Dilek Z. Hakkani-Tür
LM&Ro
142
179
0
01 Oct 2021
Reference-Centric Models for Grounded Collaborative Dialogue
Daniel Fried
Justin T. Chiu
Dan Klein
37
19
0
10 Sep 2021
iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks
Chengshu Li
Fei Xia
Roberto Martín-Martín
Michael Lingelbach
S. Srivastava
...
Karen Liu
H. Gweon
Jiajun Wu
Li Fei-Fei
Silvio Savarese
LM&Ro
139
219
0
06 Aug 2021
How Much Can CLIP Benefit Vision-and-Language Tasks?
Sheng Shen
Liunian Harold Li
Hao Tan
Mohit Bansal
Anna Rohrbach
Kai-Wei Chang
Z. Yao
Kurt Keutzer
CLIP
VLM
MLLM
180
342
0
13 Jul 2021
ManipulaTHOR: A Framework for Visual Object Manipulation
Kiana Ehsani
Winson Han
Alvaro Herrasti
Eli VanderBilt
Luca Weihs
Eric Kolve
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
153
99
0
22 Apr 2021
Learn-to-Race: A Multimodal Control Environment for Autonomous Racing
James Herman
Jonathan M Francis
Siddha Ganju
Bingqing Chen
Anirudh Koul
Abhinav Gupta
Alexey Skabelkin
Ivan Zhukov
Max Kumskoy
Eric Nyberg
10
29
0
22 Mar 2021
On the Evaluation of Vision-and-Language Navigation Instructions
Mingde Zhao
Peter Anderson
Vihan Jain
Su Wang
Alexander Ku
Jason Baldridge
Eugene Ie
231
49
0
26 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
F. Khan
M. Shah
ViT
216
2,404
0
04 Jan 2021
Multimodal Research in Vision and Language: A Review of Current and Emerging Trends
Shagun Uppal
Sarthak Bhagat
Devamanyu Hazarika
Navonil Majumder
Soujanya Poria
Roger Zimmermann
Amir Zadeh
11
6
0
19 Oct 2020
SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang
Yuzhe Qin
Kaichun Mo
Yikuan Xia
Hao Zhu
...
He-Nan Wang
Li Yi
Angel X. Chang
Leonidas J. Guibas
Hao Su
198
482
0
19 Mar 2020
Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding
Seonguk Park
Gyubok Lee
Manoj Bhat
Jimin Seo
Minseok Kang
Jonathan M Francis
Ashwin R. Jadhav
Paul Pu Liang
Louis-Philippe Morency
113
117
0
06 Mar 2020
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
167
148
0
04 Sep 2019
Neural Modular Control for Embodied Question Answering
Abhishek Das
Georgia Gkioxari
Stefan Lee
Devi Parikh
Dhruv Batra
LM&Ro
117
126
0
26 Oct 2018
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried
Ronghang Hu
Volkan Cirik
Anna Rohrbach
Jacob Andreas
Louis-Philippe Morency
Taylor Berg-Kirkpatrick
Kate Saenko
Dan Klein
Trevor Darrell
LM&Ro
LRM
237
444
0
07 Jun 2018