2111.09888
Cited By
Simple but Effective: CLIP Embeddings for Embodied AI
18 November 2021
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
VLM
LM&Ro
ArXiv (abs)
PDF
HTML
Github (126★)
Papers citing
"Simple but Effective: CLIP Embeddings for Embodied AI"
50 / 190 papers shown
Title
Human-Centric Open-Future Task Discovery: Formulation, Benchmark, and Scalable Tree-Based Search
Zijian Song
Xiaoxin Lin
Tao Pu
Zhenlong Yuan
Guangrun Wang
Liang Lin
105
0
0
24 Nov 2025
AVERY: Adaptive VLM Split Computing through Embodied Self-Awareness for Efficient Disaster Response Systems
Rajat Bhattacharjya
Sing-Yao Wu
Hyunwoo Oh
Chaewon Nam
Suyeon Koo
Mohsen Imani
Elaheh Bozorgzadeh
N. Dutt
VLM
70
0
0
22 Nov 2025
A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models
Shihab Aaqil Ahamed
Udaya S.K.P. Miriya Thanthrige
Ranga Rodrigo
Muhammad Haris Khan
VLM
146
0
0
30 Oct 2025
C-NAV: Towards Self-Evolving Continual Object Navigation in Open World
Ming-Ming Yu
Fei Zhu
Wenzhuo Liu
Y. Yang
Qunbo Wang
Wenjun Wu
Jing Liu
130
1
0
23 Oct 2025
Exploring Conditions for Diffusion models in Robotic Control
Heeseong Shin
Byeongho Heo
Dongyoon Han
Seungryong Kim
Taekyung Kim
140
0
0
17 Oct 2025
What Matters in RL-Based Methods for Object-Goal Navigation? An Empirical Study and A Unified Framework
Hongze Wang
Boyang Sun
Jiaxu Xing
Fan Yang
Marco Hutter
Dhruv Shah
Davide Scaramuzza
Marc Pollefeys
48
0
0
02 Oct 2025
LAGEA: Language Guided Embodied Agents for Robotic Manipulation
Abdul Monaf Chowdhury
Akm Moshiur Rahman Mazumder
Rabeya Akter
S. Arib
LM&Ro
80
0
0
27 Sep 2025
Revealing Multimodal Causality with Large Language Models
Jin Li
Shoujin Wang
Qi Zhang
Feng Liu
Tongliang Liu
LongBing Cao
Shui Yu
F. Chen
116
0
0
22 Sep 2025
Agentic Aerial Cinematography: From Dialogue Cues to Cinematic Trajectories
Yifan Lin
Sophie Ziyu Liu
Ran Qi
George Z. Xue
Xinping Song
Chao Qin
Hugh H. T. Liu
VGen
93
0
0
19 Sep 2025
Object Detection with Multimodal Large Vision-Language Models: An In-depth Review
Information Fusion (Inf. Fusion), 2025
Ranjan Sapkota
Manoj Karkee
ObjD
VLM
239
9
0
25 Aug 2025
Imaginative World Modeling with Scene Graphs for Embodied Agent Navigation
Yue Hu
Junzhe Wu
Ruihan Xu
Hang Liu
Avery Xi
Henry X. Liu
Ram Vasudevan
Maani Ghaffari
LM&Ro
92
2
0
09 Aug 2025
MAG-Nav: Language-Driven Object Navigation Leveraging Memory-Reserved Active Grounding
Weifan Zhang
Tingguang Li
Yuzhen Liu
LM&Ro
64
1
0
07 Aug 2025
X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention
International Conference on Learning Representations (ICLR), 2025
Xiaochen Zhao
Hongyi Xu
Guoxian Song
You Xie
Chenxu Zhang
Xiu Li
Linjie Luo
J. Suo
Yebin Liu
VGen
128
10
0
30 Jul 2025
Efficient and Generalizable Environmental Understanding for Visual Navigation
Ruoyu Wang
Xinshu Li
Chen Wang
Lina Yao
CML
184
0
0
18 Jun 2025
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2025
Yihe Tang
Wenlong Huang
Yingke Wang
Chengshu Li
Roy Yuan
Ruohan Zhang
Jiajun Wu
Li Fei-Fei
212
0
0
10 Jun 2025
MapBERT: Bitwise Masked Modeling for Real-Time Semantic Mapping Generation
Yijie Deng
Shuaihang Yuan
Congcong Wen
Niraj Pudasaini
Anthony Tzes
Geeta Chandra Raju Bethala
Yi Fang
111
0
0
09 Jun 2025
RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Junjie Li
Nan Zhang
Xiaoyang Qu
Kai Lu
Guokuan Li
Jiguang Wan
Jianzong Wang
213
1
0
03 Jun 2025
DORAEMON: Decentralized Ontology-aware Reliable Agent with Enhanced Memory Oriented Navigation
Tianjun Gu
Linfeng Li
Xuhong Wang
Chenghua Gong
Jingyu Gong
Zhizhong Zhang
Yuan Xie
Lizhuang Ma
Xin Tan
LM&Ro
392
0
0
28 May 2025
SD-OVON: A Semantics-aware Dataset and Benchmark Generation Pipeline for Open-Vocabulary Object Navigation in Dynamic Scenes
Dicong Qiu
Jiadi You
Zeying Gong
Ronghe Qiu
Hui Xiong
Junwei Liang
136
0
0
24 May 2025
Building spatial world models from sparse transitional episodic memories
Zizhan He
Maxime Daigle
Pouya Bashivan
KELM
156
0
0
19 May 2025
A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI
Lik Hang Kenny Wong
Xueyang Kang
Kaixin Bai
Jianwei Zhang
298
9
0
01 May 2025
Multimodal Perception for Goal-oriented Navigation: A Survey
I-Tak Ieong
Hao Tang
LM&Ro
LRM
261
0
0
22 Apr 2025
CL-CoTNav: Closed-Loop Hierarchical Chain-of-Thought for Zero-Shot Object-Goal Navigation with Vision-Language Models
Yuxin Cai
Xiangkun He
Maonan Wang
Hongliang Guo
W. Yau
Chen Lv
LM&Ro
LRM
287
6
0
11 Apr 2025
FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation
Xianqi Zhang
Hongliang Wei
Wenrui Wang
Xingtao Wang
Xiaopeng Fan
Debin Zhao
179
1
0
28 Mar 2025
Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification
Computer Vision and Pattern Recognition (CVPR), 2025
Dongseob Kim
Hyunjung Shim
VLM
271
0
0
21 Mar 2025
Open-World Skill Discovery from Unsegmented Demonstrations
Jingwen Deng
Zihao Wang
Shaofei Cai
Hoang Trung-Dung
Yitao Liang
167
3
0
11 Mar 2025
WMNav: Integrating Vision-Language Models into World Models for Object Goal Navigation
Dujun Nie
Xianda Guo
Yiqun Duan
Ruijun Zhang
Long Chen
LM&Ro
581
18
0
04 Mar 2025
CuriousBot: Interactive Mobile Exploration via Actionable 3D Relational Object Graph
Yixuan Wang
Leonor Fermoselle
Tarik Kelestemur
Jiuguang Wang
Yunzhu Li
189
4
0
23 Jan 2025
Visual Semantic Navigation with Real Robots
Carlos Gutiérrez-Álvarez
Pablo Ríos-Navarro
Rafael Flor-Rodríguez
Francisco Javier Acevedo-Rodríguez
Roberto J. López-Sastre
354
4
0
10 Jan 2025
Efficient Policy Adaptation with Contrastive Prompt Ensemble for Embodied Agents
Neural Information Processing Systems (NeurIPS), 2024
Wonje Choi
Woo Kyung Kim
SeungHyun Kim
Honguk Woo
271
12
0
16 Dec 2024
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Yueru Jia
Jiaming Liu
Sixiang Chen
Chenyang Gu
Zihan Wang
...
Lily Lee
Pengwei Wang
Zhongyuan Wang
Renrui Zhang
Shanghang Zhang
325
38
0
27 Nov 2024
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jiajun Xi
Yinong He
Jianing Yang
Yinpei Dai
Joyce Chai
LM&Ro
257
9
0
31 Oct 2024
Reliable Semantic Understanding for Real World Zero-shot Object Goal Navigation
International Conference on Pattern Recognition (ICPR), 2024
Halil Utku Unlu
Shuaihang Yuan
Congcong Wen
Niraj Pudasaini
Anthony Tzes
Yi Fang
142
1
0
29 Oct 2024
Zero-shot Object Navigation with Vision-Language Models Reasoning
International Conference on Pattern Recognition (ICPR), 2024
Congcong Wen
Yisiyuan Huang
Niraj Pudasaini
Yanjia Huang
Shuaihang Yuan
Yu Hao
Hui Lin
Yu-Shen Liu
Yi Fang
LM&Ro
188
20
0
24 Oct 2024
ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination
International Conference on Learning Representations (ICLR), 2024
Xinxin Zhao
Wenzhe Cai
Likun Tang
Teng Wang
LM&Ro
182
19
0
13 Oct 2024
SG-Nav: Online 3D Scene Graph Prompting for LLM-based Zero-shot Object Navigation
Neural Information Processing Systems (NeurIPS), 2024
Hang Yin
Xiuwei Xu
Zhenyu Wu
Jie Zhou
Jiwen Lu
191
64
0
10 Oct 2024
Positive-Augmented Contrastive Learning for Vision-and-Language Evaluation and Training
International Journal of Computer Vision (IJCV), 2024
Sara Sarto
Nicholas Moratelli
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
207
8
0
09 Oct 2024
PREDICT: Preference Reasoning by Evaluating Decomposed preferences Inferred from Candidate Trajectories
Stephane Aroca-Ouellette
Natalie Mackraz
B. Theobald
Katherine Metcalf
137
0
0
08 Oct 2024
The Wallpaper is Ugly: Indoor Localization using Vision and Language
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2023
Seth Pate
Lawson L. S. Wong
163
4
0
04 Oct 2024
ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI
Ahmad Elawady
Gunjan Chhablani
Ram Ramrakhya
Karmesh Yadav
Dhruv Batra
Z. Kira
Andrew Szot
OffRL
268
2
0
03 Oct 2024
DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
Zhaowei Wang
Hongming Zhang
Tianqing Fang
Ye Tian
Yue Yang
Kaixin Ma
Xiaoman Pan
Yangqiu Song
Dong Yu
LM&Ro
327
4
0
03 Oct 2024
Robo-MUTUAL: Robotic Multimodal Task Specification via Unimodal Learning
IEEE International Conference on Robotics and Automation (ICRA), 2024
Jianxiong Li
Zhihao Wang
Jinliang Zheng
Xiaoai Zhou
Guanming Wang
...
Yu Liu
Jingjing Liu
Ya-Qin Zhang
Junzhi Yu
Xianyuan Zhan
187
4
0
02 Oct 2024
Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies
IEEE International Conference on Robotics and Automation (ICRA), 2024
Ruiyu Wang
Zheyu Zhuang
Shutong Jin
Nils Ingelhag
Danica Kragic
Florian T. Pokorny
282
0
0
30 Sep 2024
FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning
IEEE International Conference on Robotics and Automation (ICRA), 2024
Jiaheng Hu
Rose Hendrix
Ali Farhadi
Aniruddha Kembhavi
Roberto Martín-Martín
Peter Stone
Kuo-Hao Zeng
Kiana Ehsani
274
38
0
25 Sep 2024
HM3D-OVON: A Dataset and Benchmark for Open-Vocabulary Object Goal Navigation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Naoki Yokoyama
Ram Ramrakhya
Abhishek Das
Dhruv Batra
Sehoon Ha
183
38
0
22 Sep 2024
Automatic Scene Generation: State-of-the-Art Techniques, Models, Datasets, Challenges, and Future Prospects
IEEE Access, 2024
Awal Ahmed Fime
Saifuddin Mahmud
Arpita Das
Md. Sunzidul Islam
Hong-Hoon Kim
VGen
3DV
183
2
0
14 Sep 2024
SOOD-ImageNet: a Large-Scale Dataset for Semantic Out-Of-Distribution Image Classification and Semantic Segmentation
Alberto Bacchin
Davide Allegro
Stefano Ghidoni
Emanuele Menegatti
170
1
0
02 Sep 2024
VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024
Senthil Hariharan Arul
Dhruva Kumar
Vivek Sugirtharaj
Richard Kim
Xuewei Qi
R. Madhivanan
Arnie Sen
Dinesh Manocha
55
2
0
15 Aug 2024
Visual Grounding for Object-Level Generalization in Reinforcement Learning
European Conference on Computer Vision (ECCV), 2024
Haobin Jiang
Zongqing Lu
LM&Ro
169
3
0
04 Aug 2024
NOLO: Navigate Only Look Once
Mengyu Bu
Shuhao Gu
Yang Feng
EgoV
283
1
0
02 Aug 2024