2111.09888
Cited By
v1
v2 (latest)
Simple but Effective: CLIP Embeddings for Embodied AI
18 November 2021
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
VLM
LM&Ro
ArXiv (abs)
PDF
HTML
Github (126★)
Papers citing
"Simple but Effective: CLIP Embeddings for Embodied AI"
50 / 190 papers shown
Title
OVExp: Open Vocabulary Exploration for Object-Oriented Navigation
Meng Wei
Tai Wang
Yilun Chen
Hanqing Wang
Jiangmiao Pang
Xihui Liu
VLM
173
7
0
12 Jul 2024
Open Scene Graphs for Open World Object-Goal Navigation
Joel Loo
Zhanxin Wu
David Hsu
LM&Ro
210
12
0
02 Jul 2024
PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators
Kuo-Hao Zeng
Zichen Zhang
Kiana Ehsani
Rose Hendrix
Jordi Salvador
Alvaro Herrasti
Ross Girshick
Aniruddha Kembhavi
Luca Weihs
LM&Ro
OffRL
175
49
0
28 Jun 2024
ET tu, CLIP? Addressing Common Object Errors for Unseen Environments
Ye Won Byun
Cathy Jiao
Shahriar Noroozizadeh
Jimin Sun
Rosa Vitiello
VLM
201
1
0
25 Jun 2024
Center-Sensitive Kernel Optimization for Efficient On-Device Incremental Learning
Xiaoxu Feng
Yan Li
De Cheng
N. Wang
Jiawei Han
CLL
158
2
0
13 Jun 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
International Conference on Machine Learning (ICML), 2024
Dongyoon Hwang
ByungKun Lee
Hojoon Lee
Hyunseung Kim
Jaegul Choo
228
0
0
10 Jun 2024
Learning Manipulation by Predicting Interaction
Jia Zeng
Qingwen Bu
Bangjun Wang
Wenke Xia
Li Chen
...
Heming Cui
Bin Zhao
Xuelong Li
Yu Qiao
Hongyang Li
330
38
0
01 Jun 2024
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation
Junjie Zhang
Fuchun Sun
Haoran He
Wenke Xia
Zhigang Wang
Bin Zhao
Xiu Li
Xuelong Li
280
25
0
30 May 2024
Leveraging Unknown Objects to Construct Labeled-Unlabeled Meta-Relationships for Zero-Shot Object Navigation
Yanwei Zheng
Changrui Li
Chuanlin Lan
Yaling Li
Xiao Zhang
Yifei Zou
Dongxiao Yu
Zhipeng Cai
197
0
0
24 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
772
152
0
23 May 2024
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control
Neural Information Processing Systems (NeurIPS), 2024
Gunshi Gupta
Karmesh Yadav
Y. Gal
Dhruv Batra
Z. Kira
Cong Lu
Tim G. J. Rudner
200
11
0
09 May 2024
What Foundation Models can Bring for Robot Learning in Manipulation: A Survey
Dingzhe Li
Yixiang Jin
Yong A
Hongze Yu
...
Huaping Liu
Gang Hua
F. Sun
Jianwei Zhang
Bin Fang
AI4CE
LM&Ro
763
24
0
28 Apr 2024
Unified Scene Representation and Reconstruction for 3D Large Language Models
Tao Chu
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Qiong Liu
Yuan Liu
206
4
0
19 Apr 2024
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
Fangwei Zhong
Kui Wu
Hai Ci
Churan Wang
Hao Chen
OffRL
180
10
0
15 Apr 2024
TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability
Shiwei Lian
Feitian Zhang
238
9
0
12 Apr 2024
Reflectance Estimation for Proximity Sensing by Vision-Language Models: Utilizing Distributional Semantics for Low-Level Cognition in Robotics
Masashi Osada
G. A. G. Ricardez
Yosuke Suzuki
Tadahiro Taniguchi
209
4
0
11 Apr 2024
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation
Computer Vision and Pattern Recognition (CVPR), 2024
Mukul Khanna
Ram Ramrakhya
Gunjan Chhablani
Sriram Yenamandra
Théophile Gervet
Matthew Chang
Z. Kira
Devendra Singh Chaplot
Dhruv Batra
Roozbeh Mottaghi
LM&Ro
216
60
0
09 Apr 2024
SUGAR: Pre-training 3D Visual Representations for Robotics
Computer Vision and Pattern Recognition (CVPR), 2024
Shizhe Chen
Ricardo Garcia Pinel
Ivan Laptev
Cordelia Schmid
209
32
0
01 Apr 2024
Online Embedding Multi-Scale CLIP Features into 3D Maps
Shun Taguchi
Hideki Deguchi
117
0
0
27 Mar 2024
C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion
Hee Suk Yoon
Eunseop Yoon
Joshua Tian Jin Tee
M. Hasegawa-Johnson
Yingzhen Li
C. Yoo
VLM
353
59
0
21 Mar 2024
Aligning Knowledge Graph with Visual Perception for Object-goal Navigation
Nuo Xu
Wen Wang
Rong Yang
Mengjie Qin
Zheyuan Lin
Wei Song
Chunlong Zhang
J. Gu
Chao Li
285
14
0
29 Feb 2024
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning
Jianxiong Li
Jinliang Zheng
Yinan Zheng
Liyuan Mao
Xiaoming Hu
...
Jihao Liu
Yu Liu
Jingjing Liu
Ya Zhang
Xianyuan Zhan
LM&Ro
OffRL
237
14
0
28 Feb 2024
RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
Hanxiao Jiang
Binghao Huang
Ruihai Wu
Zhuoran Li
Shubham Garg
H. Nayyeri
Shenlong Wang
Yunzhu Li
243
45
0
23 Feb 2024
BBSEA: An Exploration of Brain-Body Synchronization for Embodied Agents
Sizhe Yang
Qian Luo
Anumpam Pani
Yanchao Yang
175
4
0
13 Feb 2024
Towards Explainable, Safe Autonomous Driving with Language Embeddings for Novelty Identification and Active Learning: Framework and Experimental Analysis with Real-World Data Sets
Ross Greer
Mohan M. Trivedi
216
28
0
11 Feb 2024
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation
Dennis Hoftijzer
Gertjan J. Burghouts
Luuk J. Spreeuwers
201
3
0
07 Feb 2024
The Essential Role of Causality in Foundation World Models for Embodied AI
Tarun Gupta
Wenbo Gong
Chao Ma
Nick Pawlowski
Agrin Hilmkil
...
Jianfeng Gao
Stefan Bauer
Danica Kragic
Bernhard Schölkopf
Cheng Zhang
231
28
0
06 Feb 2024
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
Haoyi Zhu
Yating Wang
Di Huang
Weicai Ye
Wanli Ouyang
Tong He
SSL
3DPC
298
46
0
04 Feb 2024
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
Weihao Tan
Wentao Zhang
Shanqi Liu
Longtao Zheng
Xinrun Wang
Rui Hu
OffRL
204
34
0
25 Jan 2024
CLIP feature-based randomized control using images and text for multiple tasks and robots
Kazuki Shibata
Hideki Deguchi
Shun Taguchi
243
3
0
18 Jan 2024
VoroNav: Voronoi-based Zero-shot Object Navigation with Large Language Model
Pengying Wu
Yao Mu
Bingxian Wu
Yi Hou
Ji Ma
Shanghang Zhang
Chang-rui Liu
LM&Ro
190
59
0
05 Jan 2024
Holodeck: Language Guided Generation of 3D Embodied AI Environments
Computer Vision and Pattern Recognition (CVPR), 2024
Yue Yang
Fan-Yun Sun
Luca Weihs
Eli VanderBilt
Alvaro Herrasti
...
Lingjie Liu
Chris Callison-Burch
Mark Yatskar
Aniruddha Kembhavi
Christopher Clark
LM&Ro
371
173
0
14 Dec 2023
Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation
Shaopeng Zhai
Jie Wang
Tianyi Zhang
Fuxian Huang
Tao Gui
Ming Zhou
Jing Hou
Yu Qiao
Yu Liu
LLMAG
LM&Ro
403
4
0
12 Dec 2023
Harmonic Mobile Manipulation
Ruihan Yang
Yejin Kim
Aniruddha Kembhavi
Xiaolong Wang
Kiana Ehsani
166
25
0
11 Dec 2023
FoMo Rewards: Can we cast foundation models as reward functions?
Ekdeep Singh Lubana
Johann Brehmer
P. D. Haan
Taco S. Cohen
OffRL
LRM
207
4
0
06 Dec 2023
Understanding Representations Pretrained with Auxiliary Losses for Embodied Agent Planning
Samrudhdhi B. Rangrej
James J. Clark
SSL
207
0
0
06 Dec 2023
DGMem: Learning Visual Navigation Policy without Any Labels by Dynamic Graph Memory
Wenzhe Cai
Teng Wang
Guangran Cheng
Lele Xu
Changyin Sun
257
1
0
30 Nov 2023
Transfer Learning in Robotics: An Upcoming Breakthrough? A Review of Promises and Challenges
Noémie Jaquier
Michael C. Welle
A. Gams
Kunpeng Yao
Bernardo Fichera
A. Billard
Aleš Ude
Tamim Asfour
Danica Kragic
221
32
0
29 Nov 2023
Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations
Computer Vision and Pattern Recognition (CVPR), 2024
Lei Fan
Jianxiong Zhou
Xiaoying Xing
Ying Wu
VLM
195
7
0
28 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
364
45
0
24 Nov 2023
Selective Visual Representations Improve Convergence and Generalization for Embodied AI
Ainaz Eftekhar
Kuo-Hao Zeng
Jiafei Duan
Ali Farhadi
Aniruddha Kembhavi
Ranjay Krishna
290
23
0
07 Nov 2023
Scene-Driven Multimodal Knowledge Graph Construction for Embodied AI
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Yaoxian Song
Yixiang Chen
Haoyu Liu
Li Zhixu
Wei Song
Yanghua Xiao
Xiaofang Zhou
LM&Ro
194
30
0
07 Nov 2023
Exploitation-Guided Exploration for Semantic Embodied Navigation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Justin Wasserman
Girish Chowdhary
Abhinav Gupta
Unnat Jain
199
11
0
06 Nov 2023
Can Foundation Models Watch, Talk and Guide You Step by Step to Make a Cake?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yuwei Bao
Keunwoo Peter Yu
Yichi Zhang
Shane Storks
Itamar Bar-Yossef
Alexander De La Iglesia
Megan Su
Xiao Lin Zheng
Joyce Chai
189
16
0
01 Nov 2023
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde
Victoriano Montesinos
Elvis Nava
Ethan Perez
David Lindner
VLM
297
127
0
19 Oct 2023
Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network
Xinting Li
Shizhou Zhang
Yue Lu
Kerry Dan
Lingyan Ran
205
4
0
15 Oct 2023
An Unbiased Look at Datasets for Visuo-Motor Pre-Training
Sudeep Dasari
Mohan Kumar Srirama
Unnat Jain
Abhinav Gupta
SSL
233
51
0
13 Oct 2023
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy
IEEE International Conference on Robotics and Automation (ICRA), 2024
Zichen Zhang
Yunshuang Li
Osbert Bastani
Abhishek Gupta
Dinesh Jayaraman
Yecheng Jason Ma
Luca Weihs
218
25
0
12 Oct 2023
GROOT: Learning to Follow Instructions by Watching Gameplay Videos
International Conference on Learning Representations (ICLR), 2024
Shaofei Cai
Bowei Zhang
Zihao Wang
Xiaojian Ma
Hoang Trung-Dung
Yitao Liang
272
37
0
12 Oct 2023
Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation Using Vision Language Models
Bangguo Yu
Qihao Yuan
Kailai Li
Hamidreza Kasaei
Ming Cao
LM&Ro
257
28
0
11 Oct 2023