ResearchTrend.AI
Simple but Effective: CLIP Embeddings for Embodied AI

18 November 2021
Apoorv Khandelwal
Luca Weihs
Roozbeh Mottaghi
Aniruddha Kembhavi
    VLM
    LM&Ro
arXiv: 2111.09888

Papers citing "Simple but Effective: CLIP Embeddings for Embodied AI"

50 / 173 papers shown
Masked Path Modeling for Vision-and-Language Navigation
Zi-Yi Dou
Feng Gao
Nanyun Peng
LM&Ro
26
3
0
23 May 2023
Pick your Poison: Undetectability versus Robustness in Data Poisoning Attacks
Nils Lukas
Florian Kerschbaum
20
1
0
07 May 2023
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
Renhao Wang
Jiayuan Mao
Joy Hsu
Hang Zhao
Jiajun Wu
Yang Gao
LM&Ro
114
30
0
26 Apr 2023
Multimodal Grounding for Embodied AI via Augmented Reality Headsets for Natural Language Driven Task Planning
Selma Wanna
Fabian Parra
R. Valner
Karl Kruusamäe
Mitch Pryor
LM&Ro
22
2
0
26 Apr 2023
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
Kuo-Hao Zeng
Luca Weihs
Roozbeh Mottaghi
Ali Farhadi
22
3
0
24 Apr 2023
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation
Mohit Sharma
Claudio Fantacci
Yuxiang Zhou
Skanda Koppula
N. Heess
Jonathan Scholz
Y. Aytar
VLM
42
28
0
13 Apr 2023
L3MVN: Leveraging Large Language Models for Visual Target Navigation
Bangguo Yu
H. Kasaei
M. Cao
LM&Ro
52
84
0
11 Apr 2023
MOPA: Modular Object Navigation with PointGoal Agents
Sonia Raychaudhuri
Tommaso Campari
Unnat Jain
Manolis Savva
Angel X. Chang
3DPC
24
8
0
07 Apr 2023
ENTL: Embodied Navigation Trajectory Learner
Klemen Kotar
Aaron Walsman
Roozbeh Mottaghi
8
6
0
05 Apr 2023
Where are we in the search for an Artificial Visual Cortex for Embodied Intelligence?
Arjun Majumdar
Karmesh Yadav
Sergio Arnaud
Yecheng Jason Ma
Claire Chen
...
Dhruv Batra
Yixin Lin
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
LM&Ro
14
172
0
31 Mar 2023
When Learning Is Out of Reach, Reset: Generalization in Autonomous Visuomotor Reinforcement Learning
Zichen Zhang
Luca Weihs
OffRL
16
5
0
30 Mar 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
13
55
0
21 Mar 2023
OVRL-V2: A simple state-of-art baseline for ImageNav and ObjectNav
Karmesh Yadav
Arjun Majumdar
Ram Ramrakhya
Naoki Yokoyama
Alexei Baevski
Z. Kira
Oleksandr Maksymets
Dhruv Batra
ViT
8
45
0
14 Mar 2023
CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
Junbo Zhang
Runpei Dong
Kaisheng Ma
CLIP
VLM
24
77
0
08 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
90
154
0
07 Mar 2023
ReorientDiff: Diffusion Model based Reorientation for Object Manipulation
Utkarsh Aashu Mishra
Yongxin Chen
16
20
0
28 Feb 2023
Language-Driven Representation Learning for Robotics
Siddharth Karamcheti
Suraj Nair
Annie S. Chen
Thomas Kollar
Chelsea Finn
Dorsa Sadigh
Percy Liang
LM&Ro
SSL
31
145
0
24 Feb 2023
Paparazzi: A Deep Dive into the Capabilities of Language and Vision Models for Grounding Viewpoint Descriptions
Henrik Voigt
J. Hombeck
M. Meuschke
K. Lawonn
Sina Zarrieß
VLM
25
1
0
13 Feb 2023
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
Bingqian Lin
Yi Zhu
Xiaodan Liang
Liang Lin
Jian-zhuo Liu
CoGe
LM&Ro
31
3
0
13 Feb 2023
SOCRATES: Text-based Human Search and Approach using a Robot Dog
Jeongeun Park
Jefferson Silveria
Matthew K. X. J. Pan
Sungjoon Choi
11
0
0
10 Feb 2023
Multiple Thinking Achieving Meta-Ability Decoupling for Object Navigation
Ronghao Dang
Lu Chen
Liuyi Wang
Zongtao He
Chengju Liu
Qi Chen
LRM
19
8
0
03 Feb 2023
Emergence of Maps in the Memories of Blind Navigation Agents
Erik Wijmans
Manolis Savva
Irfan Essa
Stefan Lee
Ari S. Morcos
Dhruv Batra
24
27
0
30 Jan 2023
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
KAI-QING Zhou
Kai Zheng
Connor Pryor
Yilin Shen
Hongxia Jin
Lise Getoor
X. Wang
18
107
0
30 Jan 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Jielin Qiu
William Jongwon Han
Jiacheng Zhu
Mengdi Xu
Michael A. Rosenberg
Emerson Liu
Douglas Weber
Ding Zhao
19
21
0
21 Jan 2023
Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization
Chen Ju
Kunhao Zheng
Jinxian Liu
Peisen Zhao
Ya-Qin Zhang
Jianlong Chang
Yanfeng Wang
Qi Tian
13
11
0
19 Dec 2022
Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning
Zhecheng Yuan
Zhengrong Xue
Bo Yuan
Xueqian Wang
Yi Wu
Yang Gao
Huazhe Xu
SSL
OffRL
33
70
0
17 Dec 2022
Objaverse: A Universe of Annotated 3D Objects
Matt Deitke
Dustin Schwenk
Jordi Salvador
Luca Weihs
Oscar Michel
Eli VanderBilt
Ludwig Schmidt
Kiana Ehsani
Aniruddha Kembhavi
Ali Farhadi
22
882
0
15 Dec 2022
Self-Supervised Object Goal Navigation with In-Situ Finetuning
So Yeon Min
Yao-Hung Hubert Tsai
Wei Ding
Ali Farhadi
Ruslan Salakhutdinov
Yonatan Bisk
Jian Zhang
SSL
29
7
0
09 Dec 2022
Phone2Proc: Bringing Robust Robots Into Our Chaotic World
Matt Deitke
Rose Hendrix
Luca Weihs
Ali Farhadi
Kiana Ehsani
Aniruddha Kembhavi
LM&Ro
21
17
0
08 Dec 2022
Task Bias in Vision-Language Models
Sachit Menon
I. Chandratreya
Carl Vondrick
VLM
SSL
12
6
0
08 Dec 2022
PEANUT: Predicting and Navigating to Unseen Targets
Albert J. Zhai
Shenlong Wang
22
19
0
05 Dec 2022
Navigating to Objects in the Real World
Théophile Gervet
Soumith Chintala
Dhruv Batra
Jitendra Malik
Devendra Singh Chaplot
27
122
0
02 Dec 2022
A General Purpose Supervisory Signal for Embodied Agents
Kunal Pratap Singh
Jordi Salvador
Luca Weihs
Aniruddha Kembhavi
SSL
21
3
0
01 Dec 2022
Time-Aware Datasets are Adaptive Knowledgebases for the New Normal
Abhijit Suprem
Sanjyot Vaidya
J. Ferreira
C. Pu
24
2
0
22 Nov 2022
Last-Mile Embodied Visual Navigation
Justin Wasserman
Karmesh Yadav
Girish Chowdhary
Abhi Gupta
Unnat Jain
28
34
0
21 Nov 2022
Ask4Help: Learning to Leverage an Expert for Embodied Tasks
Kunal Pratap Singh
Luca Weihs
Alvaro Herrasti
Jonghyun Choi
Aniruddha Kembhavi
Roozbeh Mottaghi
11
19
0
18 Nov 2022
I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision
Sophia Gu
Christopher Clark
Aniruddha Kembhavi
VLM
14
24
0
17 Nov 2022
Foundation Models for Semantic Novelty in Reinforcement Learning
Tarun Gupta
Peter Karkus
Tong Che
Danfei Xu
Marco Pavone
VLM
OffRL
LRM
24
7
0
09 Nov 2022
Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai
Ron Mokady
Amir Globerson
VLM
CLIP
49
69
0
01 Nov 2022
Instruction-Following Agents with Multimodal Transformer
Hao Liu
Lisa Lee
Kimin Lee
Pieter Abbeel
LM&Ro
30
10
0
24 Oct 2022
A Visual Tour Of Current Challenges In Multimodal Language Models
Shashank Sonkar
Naiming Liu
Richard G. Baraniuk
DiffM
6
1
0
22 Oct 2022
DANLI: Deliberative Agent for Following Natural Language Instructions
Yichi Zhang
Jianing Yang
Jiayi Pan
Shane Storks
N. Devraj
Ziqiao Ma
Keunwoo Peter Yu
Yuwei Bao
J. Chai
LM&Ro
48
16
0
22 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
43
3,247
0
16 Oct 2022
Retrospectives on the Embodied AI Workshop
Matt Deitke
Dhruv Batra
Yonatan Bisk
Tommaso Campari
Angel X. Chang
...
Jesse Thomason
Alexander Toshev
Joanne Truong
Luca Weihs
Jiajun Wu
LM&Ro
35
50
0
13 Oct 2022
NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields
Arunkumar Byravan
Jan Humplik
Leonard Hasenclever
Arthur Brussee
F. Nori
...
Ben Moran
Steven Bohez
Fereshteh Sadeghi
Bojan Vujatovic
N. Heess
54
53
0
10 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
18
334
0
06 Oct 2022
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
51
62
0
29 Sep 2022
GAMA: Generative Adversarial Multi-Object Scene Attacks
Abhishek Aich
Calvin-Khang Ta
Akash Gupta
Chengyu Song
S. Krishnamurthy
M. Salman Asif
A. Roy-Chowdhury
AAML
36
17
0
20 Sep 2022
Zero-shot Active Visual Search (ZAVIS): Intelligent Object Search for Robotic Assistants
Jeongeun Park
Taerim Yoon
Jejoon Hong
Youngjae Yu
Matthew K. X. J. Pan
Sungjoon Choi
35
12
0
19 Sep 2022
Instruction-driven history-aware policies for robotic manipulations
Pierre-Louis Guhur
Shizhe Chen
Ricardo Garcia Pinel
Makarand Tapaswi
Ivan Laptev
Cordelia Schmid
LM&Ro
102
102
0
11 Sep 2022