ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2406.18115
  4. Cited By
Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with
  3D Semantic Maps

Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with 3D Semantic Maps

26 June 2024
Dicong Qiu
Wenzong Ma
Zhenfu Pan
Hui Xiong
Junwei Liang
    LM&Ro
ArXivPDFHTML

Papers citing "Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with 3D Semantic Maps"

17 / 17 papers shown
Title
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Tianxing Chen
Yao Mu
Zhixuan Liang
Z. Chen
Shijia Peng
...
Mingkun Xu
R. Hu
H. Zhang
Xuelong Li
Ping Luo
AI4CE
99
8
0
27 Nov 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Ziyuan Jiao
Siyuan Huang
Siyuan Huang
LRM
LM&Ro
38
28
0
16 Apr 2024
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic
  Manipulation
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
Guanxing Lu
Shiyi Zhang
Ziwei Wang
Changliu Liu
Jiwen Lu
Yansong Tang
41
49
0
13 Mar 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Lihe Yang
Bingyi Kang
Zilong Huang
Xiaogang Xu
Jiashi Feng
Hengshuang Zhao
VLM
139
681
0
19 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic
  Visual-Linguistic Tasks
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
156
895
0
21 Dec 2023
Multimodal Foundation Models: From Specialists to General-Purpose
  Assistants
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Chunyuan Li
Zhe Gan
Zhengyuan Yang
Jianwei Yang
Linjie Li
Lijuan Wang
Jianfeng Gao
MLLM
110
221
0
18 Sep 2023
Tag2Text: Guiding Vision-Language Model via Image Tagging
Tag2Text: Guiding Vision-Language Model via Image Tagging
Xinyu Huang
Youcai Zhang
Jinyu Ma
Weiwei Tian
Rui Feng
Yuejie Zhang
Yaqian Li
Yandong Guo
Lei Zhang
CLIP
MLLM
VLM
3DV
61
73
0
10 Mar 2023
Open-World Object Manipulation using Pre-trained Vision-Language Models
Open-World Object Manipulation using Pre-trained Vision-Language Models
Austin Stone
Ted Xiao
Yao Lu
K. Gopalakrishnan
Kuang-Huei Lee
...
Sean Kirmani
Brianna Zitkovich
F. Xia
Chelsea Finn
Karol Hausman
LM&Ro
142
144
0
02 Mar 2023
Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit
  Representation
Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation
Xingrui Yang
Hai Li
Hongjia Zhai
Yuhang Ming
Yuqian Liu
Guofeng Zhang
159
168
0
28 Oct 2022
Visual Language Maps for Robot Navigation
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
145
337
0
11 Oct 2022
Open-vocabulary Queryable Scene Representations for Real World Planning
Open-vocabulary Queryable Scene Representations for Real World Planning
Boyuan Chen
F. Xia
Brian Ichter
Kanishka Rao
K. Gopalakrishnan
Michael S. Ryoo
Austin Stone
Daniel Kappler
LM&Ro
144
179
0
20 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language,
  Vision, and Action
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
139
430
0
10 Jul 2022
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings
ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings
Arjun Majumdar
Gunjan Aggarwal
Bhavika Devnani
Judy Hoffman
Dhruv Batra
LM&Ro
147
148
0
24 Jun 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified
  Vision-Language Understanding and Generation
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
388
4,010
0
28 Jan 2022
LENS: Localization enhanced by NeRF synthesis
LENS: Localization enhanced by NeRF synthesis
Arthur Moreau
Nathan Piasco
D. Tsishkou
B. Stanciulescu
A. de La Fortelle
32
121
0
13 Oct 2021
Vision-Only Robot Navigation in a Neural Radiance World
Vision-Only Robot Navigation in a Neural Radiance World
M. Adamkiewicz
Timothy Chen
Adam Caccavale
Rachel Gardner
Preston Culbertson
Jeannette Bohg
Mac Schwager
153
227
0
01 Oct 2021
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D
  Cameras
ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras
Raul Mur-Artal
Juan D. Tardós
201
5,352
0
20 Oct 2016
1