Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.12015
Cited By
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration
20 November 2023
Naoki Wake
Atsushi Kanehira
Kazuhiro Sasabuchi
Jun Takamatsu
Katsushi Ikeuchi
LM&Ro
Re-assign community
ArXiv
PDF
HTML
Papers citing
"GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration"
19 / 19 papers shown
Title
Multi-Agent Systems for Robotic Autonomy with LLMs
Junhong Chen
Ziqi Yang
Haoyuan G Xu
Dandan Zhang
George Mylonas
LLMAG
43
0
0
09 May 2025
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
W. Zhang
Mengna Wang
Gangao Liu
Xu Huixin
Yiwei Jiang
...
Hang Zhang
Xin Li
Weiming Lu
Peng Li
Y. Zhuang
LM&Ro
LRM
65
2
0
27 Mar 2025
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Kechun Xu
Xunlong Xia
Kaixuan Wang
Yifei Yang
Yunxuan Mao
Bing Deng
R. Xiong
Y. Wang
OffRL
64
0
0
12 Mar 2025
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu
J. Li
Yichen Jiang
Niranjan Sujay
Z. Yang
Juexiao Zhang
John Abanes
Jing Zhang
Chen Feng
102
1
0
26 Nov 2024
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Chan Hee Song
Valts Blukis
Jonathan Tremblay
Stephen Tyree
Yu-Chuan Su
Stan Birchfield
83
4
0
25 Nov 2024
Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation
Seulbi Lee
J. Kim
Sangheum Hwang
LRM
38
0
0
19 Oct 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
42
3
0
16 Sep 2024
VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation
Wentao Zhao
Jiaming Chen
Ziyu Meng
Donghui Mao
Ran Song
Wei Zhang
33
8
0
13 Jul 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Ziyuan Jiao
Siyuan Huang
Siyuan Huang
LRM
LM&Ro
38
28
0
16 Apr 2024
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
Zeren Chen
Zhelun Shi
Xiaoya Lu
Lehan He
Sucheng Qian
...
Zhen-fei Yin
Jing Shao
Jing Shao
Cewu Lu
Cewu Lu
31
5
0
28 Mar 2024
ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models
Siyuan Huang
Iaroslav Ponomarenko
Zhengkai Jiang
Xiaoqi Li
Xiaobin Hu
Peng Gao
Hongsheng Li
Hao Dong
LM&Ro
32
16
0
17 Mar 2024
MOKA: Open-Vocabulary Robotic Manipulation through Mark-Based Visual Prompting
Fangchen Liu
Kuan Fang
Pieter Abbeel
Sergey Levine
LM&Ro
40
23
0
05 Mar 2024
Human Demonstrations are Generalizable Knowledge for Robots
Te Cui
Guangyan Chen
Tianxing Zhou
Zicai Peng
Mengxiao Hu
Haoyang Lu
Haizhou Li
Meiling Wang
Yi Yang
Yufeng Yue
LM&Ro
19
6
0
05 Dec 2023
Transferring Foundation Models for Generalizable Robotic Manipulation
Jiange Yang
Wenhui Tan
Chuhao Jin
Keling Yao
Bei Liu
Jianlong Fu
Ruihua Song
Gangshan Wu
Limin Wang
LM&Ro
45
6
0
09 Jun 2023
Task-sequencing Simulator: Integrated Machine Learning to Execution Simulation for Robot Manipulation
Kazuhiro Sasabuchi
Daichi Saito
Atsushi Kanehira
Naoki Wake
Jun Takamatsu
Katsushi Ikeuchi
16
7
0
03 Jan 2023
Grounding Language with Visual Affordances over Unstructured Data
Oier Mees
Jessica Borja-Diaz
Wolfram Burgard
LM&Ro
121
106
0
04 Oct 2022
Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement
Zirui Zhao
W. Lee
David Hsu
OOD
30
9
0
01 Oct 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
112
616
0
22 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
143
449
0
12 Sep 2022
1