Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2307.15818
Cited By
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
28 July 2023
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
K. Choromanski
Tianli Ding
Danny Driess
Kumar Avinava Dubey
Chelsea Finn
Peter R. Florence
Chuyuan Fu
Montse Gonzalez Arenas
K. Gopalakrishnan
Kehang Han
Karol Hausman
Alexander Herzog
Jasmine Hsu
Brian Ichter
A. Irpan
Nikhil J. Joshi
Ryan C. Julian
Dmitry Kalashnikov
Yuheng Kuang
Isabel Leal
Lisa Lee
Tsang-Wei Edward Lee
Sergey Levine
Yao Lu
Henryk Michalewski
Igor Mordatch
Karl Pertsch
Kanishka Rao
Krista Reymann
Michael S. Ryoo
Grecia Salazar
Pannag R. Sanketi
P. Sermanet
Jaspiar Singh
Anika Singh
Radu Soricut
Huong Tran
Vincent Vanhoucke
Q. Vuong
Ayzaan Wahid
Stefan Welker
Paul Wohlhart
Jialin Wu
Fei Xia
Ted Xiao
Peng-Tao Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control"
50 / 194 papers shown
Title
Pixel Motion as Universal Representation for Robot Control
Kanchana Ranasinghe
Xiang Li
Cristina Mata
J. Park
Michael S. Ryoo
VGen
22
0
0
12 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Qingwen Bu
Y. Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
45
0
0
09 May 2025
3D CAVLA: Leveraging Depth and 3D Context to Generalize Vision Language Action Models for Unseen Tasks
V. Bhat
Yu-Hsiang Lan
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
45
0
0
09 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
LM&Ro
VLM
73
0
0
08 May 2025
SITE: towards Spatial Intelligence Thorough Evaluation
W. Wang
Reuben Tan
Pengyue Zhu
Jianwei Yang
Zhengyuan Yang
Lijuan Wang
Andrey Kolobov
Jianfeng Gao
Boqing Gong
43
0
0
08 May 2025
Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Denis Shcherba
Eckart Cobo-Briesewitz
Cornelius V. Braun
Marc Toussaint
LM&Ro
LRM
29
0
0
06 May 2025
Task Reconstruction and Extrapolation for
π
0
π_0
π
0
using Text Latent
Quanyi Li
28
0
0
06 May 2025
RoboOS: A Hierarchical Embodied Framework for Cross-Embodiment and Multi-Agent Collaboration
Huajie Tan
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Yaoxu Lyu
Mingyu Cao
Zhongyuan Wang
S. Zhang
LM&Ro
36
0
0
06 May 2025
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
Xiaoqi Li
Lingyun Xu
M. Zhang
Jiaming Liu
Yan Shen
...
Jiahui Xu
Liang Heng
Siyuan Huang
S. Zhang
Hao Dong
LM&Ro
39
0
0
04 May 2025
Interleave-VLA: Enhancing Robot Manipulation with Interleaved Image-Text Instructions
Cunxin Fan
Xiaosong Jia
Yihang Sun
Yixiao Wang
Jianglan Wei
...
Xiangyu Zhao
M. Tomizuka
Xue Yang
Junchi Yan
Mingyu Ding
LM&Ro
VLM
62
2
0
04 May 2025
Prompt-responsive Object Retrieval with Memory-augmented Student-Teacher Learning
Malte Mosbach
Sven Behnke
29
0
0
04 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
45
0
0
03 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
L. Zhang
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
49
0
0
02 May 2025
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow
Changhe Chen
Quantao Yang
Xiaohao Xu
Nima Fazeli
Olov Andersson
22
0
0
02 May 2025
Robotic Visual Instruction
Y. Li
Ziyang Gong
H. Li
Xiaoqi Huang
Haolan Kang
Guangping Bai
Xianzheng Ma
LM&Ro
66
0
0
01 May 2025
A Survey of Robotic Navigation and Manipulation with Physics Simulators in the Era of Embodied AI
Lik Hang Kenny Wong
Xueyang Kang
Kaixin Bai
Jianwei Zhang
45
0
0
01 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
M. Yan
Fei Huang
Bo An
20
0
0
01 May 2025
RoboGround: Robotic Manipulation with Grounded Vision-Language Priors
Haifeng Huang
Xinyi Chen
Y. Chen
H. Li
Xiaoshen Han
Z. Wang
Tai Wang
Jiangmiao Pang
Zhou Zhao
LM&Ro
75
0
0
30 Apr 2025
A Survey of Interactive Generative Video
Jiwen Yu
Yiran Qin
Haoxuan Che
Quande Liu
X. Wang
Pengfei Wan
Di Zhang
Kun Gai
Hao Chen
Xihui Liu
VGen
53
0
0
30 Apr 2025
GPA-RAM: Grasp-Pretraining Augmented Robotic Attention Mamba for Spatial Task Learning
Juyi Sheng
Yangjun Liu
Sheng Xu
Zhixin Yang
Mengyuan Liu
51
0
0
28 Apr 2025
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
Chia-Yu Hung
Qi Sun
Pengfei Hong
Amir Zadeh
Chuan Li
U-Xuan Tan
Navonil Majumder
Soujanya Poria
LM&Ro
37
1
0
28 Apr 2025
Generative AI in Embodied Systems: System-Level Analysis of Performance, Efficiency and Scalability
Zishen Wan
Jiayi Qian
Yuhang Du
Jason J. Jabbour
Yilun Du
Yang Katie Zhao
A. Raychowdhury
Tushar Krishna
Vijay Janapa Reddi
LM&Ro
86
0
0
26 Apr 2025
RL-Driven Data Generation for Robust Vision-Based Dexterous Grasping
Atsushi Kanehira
Naoki Wake
Kazuhiro Sasabuchi
Jun Takamatsu
Katsushi Ikeuchi
42
0
0
25 Apr 2025
CIVIL: Causal and Intuitive Visual Imitation Learning
Yinlong Dai
Robert Ramirez Sanchez
Ryan Jeronimus
Shahabedin Sagheb
Cara M. Nunez
Heramb Nemlekar
Dylan P. Losey
58
0
0
24 Apr 2025
Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration
Tyler Ga Wei Lum
Olivia Y. Lee
C. Karen Liu
Jeannette Bohg
33
1
0
17 Apr 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
Rongtao Xu
J. Zhang
Minghao Guo
Youpeng Wen
H. Yang
...
Liqiong Wang
Yuxuan Kuang
Meng Cao
Feng Zheng
Xiaodan Liang
37
3
0
17 Apr 2025
Perception-R1: Pioneering Perception Policy with Reinforcement Learning
En Yu
Kangheng Lin
Liang Zhao
Jisheng Yin
Yana Wei
...
Zheng Ge
Xiangyu Zhang
Daxin Jiang
Jingyu Wang
Wenbing Tao
VLM
OffRL
LRM
35
0
0
10 Apr 2025
ViTaMIn: Learning Contact-Rich Tasks Through Robot-Free Visuo-Tactile Manipulation Interface
Fangchen Liu
Chuanyu Li
Yihua Qin
Ankit Shaw
J. Xu
Pieter Abbeel
Rui Chen
38
1
0
08 Apr 2025
STI-Bench: Are MLLMs Ready for Precise Spatial-Temporal World Understanding?
Y. Li
Y. Zhang
Tao Lin
Xiangrui Liu
Wenxiao Cai
Zheng Liu
Bo Zhao
LRM
56
0
0
31 Mar 2025
Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation
Abhiram Maddukuri
Z. L. Jiang
L. Chen
Soroush Nasiriany
Yuqi Xie
...
Scott Reed
Ken Goldberg
Ajay Mandlekar
Linxi Fan
Yuke Zhu
59
1
0
31 Mar 2025
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
W. Zhang
Mengna Wang
Gangao Liu
Xu Huixin
Yiwei Jiang
...
Hang Zhang
Xin Li
Weiming Lu
Peng Li
Y. Zhuang
LM&Ro
LRM
65
2
0
27 Mar 2025
MoLe-VLA: Dynamic Layer-skipping Vision Language Action Model via Mixture-of-Layers for Efficient Robot Manipulation
Rongyu Zhang
Menghang Dong
Yuan Zhang
Liang Heng
Xiaowei Chi
Gaole Dai
Li Du
Dan Wang
Yuan Du
MoE
81
0
0
26 Mar 2025
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization
Liang Pan
Zeshi Yang
Zhiyang Dou
Wenjia Wang
Buzhen Huang
Bo Dai
Taku Komura
Jingbo Wang
45
1
0
25 Mar 2025
Efficient Continual Adaptation of Pretrained Robotic Policy with Online Meta-Learned Adapters
Ruiqi Zhu
Endong Sun
Guanhe Huang
Oya Celiktutan
CLL
OnRL
59
0
0
24 Mar 2025
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning
Nvidia
A. Azzolini
Hannah Brandon
Prithvijit Chattopadhyay
Huayu Chen
...
Yao Xu
X. Yang
Zhuolin Yang
Xiaohui Zeng
Z. Zhang
LM&Ro
LRM
AI4CE
52
5
0
18 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
59
15
0
18 Mar 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
84
7
0
16 Mar 2025
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
Haoqi Yuan
Yu Bai
Yuhui Fu
Bohan Zhou
Yicheng Feng
Xinrun Xu
Yi Zhan
Börje F. Karlsson
Zongqing Lu
LM&Ro
74
0
0
16 Mar 2025
Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping
David Snyder
Asher Hancock
Apurva Badithela
Emma Dixon
Patrick "Tree" Miller
Rares Ambrus
Anirudha Majumdar
Masha Itkina
Haruki Nishimura
OffRL
82
1
0
14 Mar 2025
Masked Sensory-Temporal Attention for Sensor Generalization in Quadruped Locomotion
Dikai Liu
Tianwei Zhang
Jianxiong Yin
Simon See
82
1
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
64
6
0
13 Mar 2025
EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments
Dongping Li
Tielong Cai
Tianci Tang
Wenhao Chai
Katherine Rose Driggs-Campbell
Gaoang Wang
LM&Ro
56
0
0
11 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
39
0
0
10 Mar 2025
MatchMaker: Automated Asset Generation for Robotic Assembly
Yian Wang
Bingjie Tang
Chuang Gan
Dieter Fox
Kaichun Mo
Yashraj S. Narang
Iretiayo Akinola
50
0
0
07 Mar 2025
VLA Model-Expert Collaboration for Bi-directional Manipulation Learning
Tian-Yu Xiang
Ao-Qun Jin
Xiao-Hu Zhou
Mei-Jiang Gui
Xiao-Liang Xie
...
Shuang-Yi Wang
Sheng-Bin Duang
Si-Cheng Wang
Zheng Lei
Z. Hou
58
1
0
06 Mar 2025
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang
Fangchen Liu
Letian Fu
Tingfan Wu
Mustafa Mukadam
Jitendra Malik
Ken Goldberg
Pieter Abbeel
LM&Ro
VLM
74
5
0
05 Mar 2025
AirExo-2: Scaling up Generalizable Robotic Imitation Learning with Low-Cost Exoskeletons
Hongjie Fang
Chenxi Wang
Yiming Wang
J. Chen
Shangning Xia
...
Xinyu Zhan
Lixin Yang
Weiming Wang
Cewu Lu
Hao-Shu Fang
80
1
0
05 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei K. Zhang
Bo Yang
Hua Chen
59
1
0
05 Mar 2025
UAV-VLRR: Vision-Language Informed NMPC for Rapid Response in UAV Search and Rescue
Yasheerah Yaqoot
Muhammad Ahsan Mustafa
Oleg Sautenkov
Dzmitry Tsetserukou
Valerii Serpiva
Dzmitry Tsetserukou
47
1
0
04 Mar 2025
UAV-VLPA*: A Vision-Language-Path-Action System for Optimal Route Generation on a Large Scales
Oleg Sautenkov
Aibek Akhmetkazy
Yasheerah Yaqoot
Muhammad Ahsan Mustafa
Grik Tadevosyan
Artem Lykov
Dzmitry Tsetserukou
60
0
0
04 Mar 2025
1
2
3
4
Next