Simple but Effective: CLIP Embeddings for Embodied AI

18 November 2021
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
    VLM, LM&Ro
ArXiv (abs) · PDF · HTML · GitHub (126★)

Papers citing "Simple but Effective: CLIP Embeddings for Embodied AI"

50 / 190 papers shown
Human-oriented Representation Learning for Robotic Manipulation
Mingxiao Huo
Mingyu Ding
Chenfeng Xu
Thomas Tian
Xinghao Zhu
Yao Mu
Lingfeng Sun
Masayoshi Tomizuka
Wei Zhan
SSL
238
13
0
04 Oct 2023
What do we learn from a large-scale study of pre-trained visual representations in sim and real environments?
IEEE International Conference on Robotics and Automation (ICRA), 2023
Sneha Silwal
Karmesh Yadav
Tingfan Wu
Jay Vakil
Arjun Majumdar
...
Dhruv Batra
Aravind Rajeswaran
Mrinal Kalakrishnan
Franziska Meier
Oleksandr Maksymets
SSL, LM&Ro
266
15
0
03 Oct 2023
Learning to Terminate in Object Navigation
Asian Conference on Machine Learning (ACML), 2023
Yuhang Song
Anh Nguyen
Chun-Yi Lee
173
4
0
28 Sep 2023
An In-depth Survey of Large Language Model-based Artificial Intelligence Agents
Pengyu Zhao
Zijian Jin
Ning Cheng
LLMAG
173
34
0
23 Sep 2023
Bridging Zero-shot Object Navigation and Foundation Models through Pixel-Guided Navigation Skill
IEEE International Conference on Robotics and Automation (ICRA), 2023
Wenzhe Cai
Siyuan Huang
Guangran Cheng
Yuxing Long
Shiyang Feng
Changyin Sun
Hao Dong
LM&Ro
242
86
0
19 Sep 2023
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
Neural Information Processing Systems (NeurIPS), 2023
Hongchen Wang
Andy Guan Hong Chen
Xiaoqi Li
Mingdong Wu
Hao Dong
343
24
0
15 Sep 2023
SayNav: Grounding Large Language Models for Dynamic Planning to Navigation in New Environments
International Conference on Automated Planning and Scheduling (ICAPS), 2023
Abhinav Rajvanshi
Karan Sikka
Xiao Lin
Bhoram Lee
Han-Pang Chiu
Alvaro Velasquez
LM&Ro, LRM, LLMAG
547
85
0
08 Sep 2023
Object Goal Navigation with Recursive Implicit Maps
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Shizhe Chen
Thomas Chabal
Ivan Laptev
Cordelia Schmid
186
31
0
10 Aug 2023
Robust Visual Sim-to-Real Transfer for Robotic Manipulation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Ricardo Garcia Pinel
Robin Strudel
Shizhe Chen
Etienne Arlaud
Ivan Laptev
Cordelia Schmid
OffRL
232
8
0
28 Jul 2023
SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces
European Conference on Artificial Intelligence (ECAI), 2023
Iván Vallés-Pérez
Grzegorz Beringer
Piotr Bilinski
G. Cook
Roberto Barra-Chicote
142
1
0
23 Jul 2023
Learning Navigational Visual Representations with Semantic Map Supervision
IEEE International Conference on Computer Vision (ICCV), 2023
Yicong Hong
Yang Zhou
Ruiyi Zhang
Franck Dernoncourt
Trung Bui
Stephen Gould
Hao Tan
SSL
199
42
0
23 Jul 2023
Is Imitation All You Need? Generalized Decision-Making with Dual-Phase Training
IEEE International Conference on Computer Vision (ICCV), 2023
Yao Wei
Yanchao Sun
Ruijie Zheng
Sai H. Vemprala
Rogerio Bonatti
Shuhang Chen
Ratnesh Madaan
Zhongjie Ba
Ashish Kapoor
Shuang Ma
OffRL
234
20
0
16 Jul 2023
Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Ryosuke Korekata
Motonari Kambara
Yusuke Yoshida
Shintaro Ishikawa
Yosuke Kawasaki
Masaki Takahashi
K. Sugiura
LM&Ro
168
8
0
14 Jul 2023
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Annie Xie
Lisa Lee
Ted Xiao
Chelsea Finn
232
88
0
07 Jul 2023
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks
IEEE International Conference on Robotics and Automation (ICRA), 2023
Xingyu Lin
John So
Sashwat Mahalingam
Fangchen Liu
Pieter Abbeel
SSL
284
34
0
07 Jul 2023
DoReMi: Grounding Language Model by Detecting and Recovering from Plan-Execution Misalignment
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Yanjiang Guo
Yen-Jen Wang
Lihan Zha
Zheyuan Jiang
Jianyu Chen
LM&Ro
415
60
0
01 Jul 2023
HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Vuong Dinh An
Toan Tien Nguyen
Minh Nhat Vu
Baoru Huang
Dzung Nguyen
H. Binh
T. Vo
Anh Nguyen
168
12
0
20 Jun 2023
Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation
Computer Vision and Pattern Recognition (CVPR), 2023
Mukul Khanna
Yongsen Mao
Hanxiao Jiang
Sanjay Haresh
Brennan Schacklett
Dhruv Batra
Alexander Clegg
Eric Undersander
Angel X. Chang
Manolis Savva
3DV
391
123
0
20 Jun 2023
A Universal Semantic-Geometric Representation for Robotic Manipulation
Conference on Robot Learning (CoRL), 2023
Tong Zhang
Yingdong Hu
Hanchen Cui
Hang Zhao
Yang Gao
204
25
0
18 Jun 2023
ArtWhisperer: A Dataset for Characterizing Human-AI Interactions in Artistic Creations
International Conference on Machine Learning (ICML), 2023
Kailas Vodrahalli
James Zou
214
9
0
13 Jun 2023
Embodied Executable Policy Learning with Language-based Scene Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jielin Qiu
Mengdi Xu
William Jongwon Han
Seungwhan Moon
Ding Zhao
LM&Ro
139
9
0
09 Jun 2023
CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities
IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 2023
A. Agrawal
Raghav Arora
Ahana Datta
Snehasis Banerjee
Brojeshwar Bhowmick
Krishna Murthy Jatavallabhula
Mohan Sridharan
Madhava Krishna
247
4
0
02 Jun 2023
LIV: Language-Image Representations and Rewards for Robotic Control
International Conference on Machine Learning (ICML), 2023
Yecheng Jason Ma
William Liang
Vaidehi Som
Vikash Kumar
Amy Zhang
Osbert Bastani
Dinesh Jayaraman
LM&Ro
194
174
0
01 Jun 2023
Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2023
Jialong Wu
Haoyu Ma
Chao Deng
Mingsheng Long
OffRL
265
43
0
29 May 2023
Masked Path Modeling for Vision-and-Language Navigation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zi-Yi Dou
Feng Gao
Nanyun Peng
LM&Ro
170
4
0
23 May 2023
Pick your Poison: Undetectability versus Robustness in Data Poisoning Attacks
Nils Lukas
Florian Kerschbaum
239
1
0
07 May 2023
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
International Conference on Learning Representations (ICLR), 2023
Renhao Wang
Jiayuan Mao
Joy Hsu
Hang Zhao
Jiajun Wu
Yang Gao
LM&Ro
314
41
0
26 Apr 2023
Multimodal Grounding for Embodied AI via Augmented Reality Headsets for Natural Language Driven Task Planning
Selma Wanna
Fabian Parra
R. Valner
Karl Kruusamäe
Mitch Pryor
LM&Ro
154
3
0
26 Apr 2023
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
International Conference on Learning Representations (ICLR), 2023
Kuo-Hao Zeng
Luca Weihs
Roozbeh Mottaghi
Ali Farhadi
157
3
0
24 Apr 2023
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation
International Conference on Learning Representations (ICLR), 2023
Mohit Sharma
Claudio Fantacci
Yuxiang Zhou
Skanda Koppula
N. Heess
Jonathan Scholz
Y. Aytar
VLM
226
37
0
13 Apr 2023
L3MVN: Leveraging Large Language Models for Visual Target Navigation
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023
Bangguo Yu
Hamidreza Kasaei
M. Cao
LM&Ro
221
159
0
11 Apr 2023
MOPA: Modular Object Navigation with PointGoal Agents
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Sonia Raychaudhuri
Tommaso Campari
Unnat Jain
Manolis Savva
Angel X. Chang
3DPC
353
13
0
07 Apr 2023
ENTL: Embodied Navigation Trajectory Learner
IEEE International Conference on Computer Vision (ICCV), 2023
Klemen Kotar
Aaron Walsman
Roozbeh Mottaghi
331
12
0
05 Apr 2023
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
Neural Information Processing Systems (NeurIPS), 2023
Arjun Majumdar
Karmesh Yadav
Sergio Arnaud
Yecheng Jason Ma
Claire Chen
...
Dhruv Batra
Yixin Lin
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
LM&Ro
349
240
0
31 Mar 2023
When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning
Zichen Zhang
Luca Weihs
OffRL
191
5
0
30 Mar 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Computer Vision and Pattern Recognition (CVPR), 2023
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
292
84
0
21 Mar 2023
OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav
Karmesh Yadav
Arjun Majumdar
Ram Ramrakhya
Naoki Yokoyama
Alexei Baevski
Z. Kira
Oleksandr Maksymets
Dhruv Batra
ViT
289
71
0
14 Mar 2023
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
Junbo Zhang
Runpei Dong
Kaisheng Ma
CLIP, VLM
215
106
0
08 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro, OffRL, LRM, AI4CE
368
206
0
07 Mar 2023
ReorientDiff: Diffusion Model based Reorientation for Object Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2023
Utkarsh Aashu Mishra
Yongxin Chen
217
24
0
28 Feb 2023
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Abigail Z. Jacobs
LM&Ro, SSL
250
189
0
24 Feb 2023
Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
Findings (Findings), 2023
Henrik Voigt
J. Hombeck
M. Meuschke
K. Lawonn
Sina Zarrieß
VLM
205
1
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
AAAI Conference on Artificial Intelligence (AAAI), 2023
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe, LM&Ro
259
5
0
13 Feb 2023
SOCRATES: Text-based Human Search and Approach using a Robot Dog
Jeongeun Park
Jefferson Silveria
Matthew K. X. J. Pan
Sungjoon Choi
110
0
0
10 Feb 2023
Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation
International Conference on Machine Learning (ICML), 2023
Ronghao Dang
Lu Chen
Liuyi Wang
Zongtao He
Chengju Liu
Qi Chen
LRM
121
15
0
03 Feb 2023
Emergence of Maps in the Memories of Blind Navigation Agents
International Conference on Learning Representations (ICLR), 2023
Erik Wijmans
Manolis Savva
Irfan Essa
Stefan Lee
Ari S. Morcos
Dhruv Batra
191
40
0
30 Jan 2023
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
International Conference on Machine Learning (ICML), 2023
KAI-QING Zhou
Kai Zheng
Connor Pryor
Yilin Shen
Hongxia Jin
Lise Getoor
Xinze Wang
339
171
0
30 Jan 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Findings (Findings), 2023
Jielin Qiu
William Jongwon Han
Jiacheng Zhu
Mengdi Xu
Michael A. Rosenberg
Emerson Liu
Douglas Weber
Ding Zhao
190
26
0
21 Jan 2023
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Computer Vision and Pattern Recognition (CVPR), 2022
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
172
17
0
19 Dec 2022
Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Zhecheng Yuan
Zhengrong Xue
Bo Yuan
Xueqian Wang
Yi Wu
Yang Gao
Huazhe Xu
SSL, OffRL
234
92
0
17 Dec 2022