2406.09246
Cited By
v1
v2 (latest)
OpenVLA: An Open-Source Vision-Language-Action Model
13 June 2024
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
Suraj Nair
Rafael Rafailov
Ethan P. Foster
Grace Lam
Pannag R Sanketi
Quan Vuong
Thomas Kollar
Benjamin Burchfiel
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
HuggingFace (40 upvotes)
Papers citing
"OpenVLA: An Open-Source Vision-Language-Action Model"
50 / 723 papers shown
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model
Y. Fu
Ning Chen
Junkai Zhao
Shaozhe Shan
Guocai Yao
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
226
2
0
21 Nov 2025
Stable Offline Hand-Eye Calibration for any Robot with Just One Mark
Sicheng Xie
Lingchen Meng
Zhiying Du
Shuyuan Tu
Haidong Cao
Jiaqi Leng
Z. F. Wu
Yu-Gang Jiang
189
0
0
21 Nov 2025
RynnVLA-002: A Unified Vision-Language-Action and World Model
Jun Cen
Siteng Huang
Yuqian Yuan
Kehan Li
Hangjie Yuan
...
Xin Li
Hao Luo
Fan Wang
Deli Zhao
H. Chen
VGen
SyDa
325
2
0
21 Nov 2025
H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation
Yijie Zhu
Rui Shao
Ziyang Liu
Jie He
Jizhihui Liu
Jiuru Wang
Zitong Yu
222
1
0
21 Nov 2025
SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding
Nikolay Nikolov
Giuliano Albanese
Sombit Dey
Aleksandar Yanev
Luc Van Gool
Jan-Nico Zaech
D. Paudel
LM&Ro
429
0
0
21 Nov 2025
IndustryNav: Exploring Spatial Reasoning of Embodied Agents in Dynamic Industrial Navigation
Y. Li
Lichi Li
Anh Dao
Xinyu Zhou
Yicheng Qiao
...
Daeun Lee
Z. Chen
Zhen Tan
Mohit Bansal
Yu Kong
164
0
0
21 Nov 2025
RoboCOIN: An Open-Sourced Bimanual Robotic Data COllection for INtegrated Manipulation
Shihan Wu
Xuecheng Liu
Shaoxuan Xie
Pengwei Wang
Xinghang Li
...
Hao Zhao
Tiejun Huang
Shanghang Zhang
Yonghua Lin
Zhongyuan Wang
205
5
0
21 Nov 2025
Learning Diffusion Policies for Robotic Manipulation of Timber Joinery under Fabrication Uncertainty
Salma Mozaffari
Daniel Ruan
W. V. D. Bogert
Nima Fazeli
Sigrid Adriaenssens
Arash Adel
107
0
0
21 Nov 2025
VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation
Hanyu Zhou
Chuanhao Ma
Gim Hee Lee
193
0
0
21 Nov 2025
BOP-ASK: Object-Interaction Reasoning for Vision-Language Models
V. Bhat
Sungsu Kim
Valts Blukis
Greg Heinrich
Prashanth Krishnamurthy
Ramesh Karri
Stan Birchfield
Farshad Khorrami
Jonathan Tremblay
VLM
253
1
0
20 Nov 2025
FT-NCFM: An Influence-Aware Data Distillation Framework for Efficient VLA Models
Kewei Chen
Yayu Long
Shuai Li
Mingsheng Shang
86
0
0
20 Nov 2025
When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models
Yuping Yan
Yuhan Xie
Yinxin Zhang
Lingjuan Lyu
Yaochu Jin
AAML
333
1
0
20 Nov 2025
InternData-A1: Pioneering High-Fidelity Synthetic Data for Pre-training Generalist Policy
Yang Tian
Yuyin Yang
Yiman Xie
Zetao Cai
Xu Shi
...
Ping Wang
Junhao Cai
Jia Zeng
Hao Dong
Jiangmiao Pang
136
4
0
20 Nov 2025
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
Yi Yang
X. Li
Yiyang Chen
Jin Song
Yihan Wang
Zipeng Xiao
Jiadi Su
You Qiaoben
Pengfei Liu
Zhijie Deng
VLM
207
1
0
20 Nov 2025
VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference
Ziyan Liu
Y. Chen
Hongyi Cai
Tao Lin
Shuo Yang
Zheng Liu
Bo Zhao
VLM
327
1
0
20 Nov 2025
In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data
Xiongyi Cai
Ri-Zhao Qiu
Geng Chen
Lai Wei
Isabella Liu
Tianshu Huang
Xuxin Cheng
Xiaolong Wang
EgoV
366
1
0
19 Nov 2025
SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models
Senyu Fei
Siyin Wang
Li Ji
Ao Li
Shiduo Zhang
Liming Liu
Jinlong Hou
Jingjing Gong
Xianzhong Zhao
Xipeng Qiu
132
0
0
19 Nov 2025
Theoretical Closed-loop Stability Bounds for Dynamical System Coupled with Diffusion Policies
Gabriel Lauzier
Alexandre Girard
François Ferland
93
0
0
19 Nov 2025
HMC: Learning Heterogeneous Meta-Control for Contact-Rich Loco-Manipulation
Lai Wei
Xuanbin Peng
Ri-Zhao Qiu
Tianshu Huang
Xuxin Cheng
Xiaolong Wang
100
2
0
18 Nov 2025
FlexiCup: Wireless Multimodal Suction Cup with Dual-Zone Vision-Tactile Sensing
Junhao Gong
Shoujie Li
Kit-Wa Sou
Changqing Guo
Hourong Huang
...
Yifan Xie
Chenxin Liang
Chuqiao Lyu
Xiaojun Liang
Wenbo Ding
153
2
0
18 Nov 2025
VLA-R: Vision-Language Action Retrieval toward Open-World End-to-End Autonomous Driving
Hyunki Seong
Seongwoo Moon
Hojin Ahn
Jehun Kang
David Hyunchul Shim
VLM
202
1
0
16 Nov 2025
Decoupled Action Head: Confining Task Knowledge to Conditioning Layers
Jian Zhou
Sihao Lin
Shuai Fu
Qi Wu
OffRL
115
0
0
15 Nov 2025
AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models
Jiayu Li
Yunhan Zhao
Xiang Zheng
Zonghuan Xu
Yige Li
Xingjun Ma
Yu-Gang Jiang
AAML
361
0
0
15 Nov 2025
Audio-VLA: Adding Contact Audio Perception to Vision-Language-Action Model for Robotic Manipulation
Xiangyi Wei
Haotian Zhang
Xinyi Cao
Siyu Xie
Weifeng Ge
Yang Li
C. Wang
246
0
0
13 Nov 2025
SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation
W. Li
Renshan Zhang
Rui Shao
Zhijian Fang
Kaiwen Zhou
Zhuotao Tian
Liqiang Nie
370
2
0
13 Nov 2025
Learning a Thousand Tasks in a Day
Science Robotics (Sci. Robot.), 2025
Kamil Dreczkowski
Pietro Vitiello
Vitalis Vosylius
Edward Johns
OffRL
409
2
0
13 Nov 2025
ViPRA: Video Prediction for Robot Actions
Sandeep Routray
Hengkai Pan
Unnat Jain
Shikhar Bahl
Deepak Pathak
242
2
0
11 Nov 2025
SlotVLA: Towards Modeling of Object-Relation Representations in Robotic Manipulation
Taisei Hanyu
Nhat Chung
H. Le
T. Nguyen
Yuki Ikebe
...
Tung Kieu
Kashu Yamazaki
Chase Rainwater
A. Nguyen
Ngan Le
278
1
0
10 Nov 2025
How Do VLAs Effectively Inherit from VLMs?
Chuheng Zhang
Rushuai Yang
Xiaoyu Chen
Kaixin Wang
Li Zhao
Yi-Ling Chen
Jiang Bian
LM&Ro
298
1
0
10 Nov 2025
ExpReS-VLA: Specializing Vision-Language-Action Models Through Experience Replay and Retrieval
Shahram Najam Syed
Yatharth Ahuja
Arthur Jakobsson
Jeff Ichnowski
VLM
110
3
0
09 Nov 2025
From Words to Safety: Language-Conditioned Safety Filtering for Robot Navigation
Zeyuan Feng
Haimingyue Zhang
Somil Bansal
87
0
0
08 Nov 2025
10 Open Challenges Steering the Future of Vision-Language-Action Models
Soujanya Poria
Navonil Majumder
Chia-Yu Hung
Amir Ali Bagherzadeh
Chuan Li
Kenneth Kwok
Z. Wang
Cheston Tan
Jiajun Wu
David Hsu
LM&Ro
VLM
329
0
0
08 Nov 2025
Towards Human-AI-Robot Collaboration and AI-Agent based Digital Twins for Parkinson's Disease Management: Review and Outlook
Hassan Hizeh
Rim Chighri
Muhammad Mahboob Ur Rahman
Mohamed A. Bahloul
Ali Muqaibel
Tareq Y. Al-Naffouri
125
0
0
08 Nov 2025
Let Me Show You: Learning by Retrieving from Egocentric Video for Robotic Manipulation
Yichen Zhu
Feifei Feng
121
2
0
07 Nov 2025
Visual Spatial Tuning
Rui Yang
Ziyu Zhu
Yanwei Li
Jingjia Huang
Shen Yan
...
Xiangtai Li
S. Li
Wenqian Wang
Yi Lin
Hengshuang Zhao
VLM
347
7
0
07 Nov 2025
EveryDayVLA: A Vision-Language-Action Model for Affordable Robotic Manipulation
Samarth Chopra
Alex McMoil
Ben Carnovale
Evan Sokolson
Rajkumar Kubendran
Samuel Dickerson
96
0
0
07 Nov 2025
Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment
Tao Lin
Yilei Zhong
Yuxin Du
Jingjing Zhang
Jiting Liu
...
Yanwen Zou
Lixing Zou
Zhaoye Zhou
Gen Li
Bo Zhao
VLM
170
4
0
06 Nov 2025
Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions
Kaifeng Zhang
Shuo Sha
Hanxiao Jiang
M. Loper
Hyunjong Song
Guangyan Cai
Zhuo Xu
Xiaochen Hu
Changxi Zheng
Yunzhu Li
324
4
0
06 Nov 2025
Cambrian-S: Towards Spatial Supersensing in Video
Shusheng Yang
J. Yang
Pinzhi Huang
Ellis L Brown
Zihao Yang
...
Daohan Lu
Rob Fergus
Yann LeCun
Li Fei-Fei
Saining Xie
178
19
0
06 Nov 2025
GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies
Maelic Neau
Zoe Falomir
Paulo E. Santos
Anne-Gwenn Bosser
Cédric Buche
93
1
0
06 Nov 2025
GUIDES: Guidance Using Instructor-Distilled Embeddings for Pre-trained Robot Policy Enhancement
Minquan Gao
Xinyi Li
Qing Yan
Xiaojian Sun
Xiaopan Zhang
Chien-Ming Huang
Jiachen Li
191
0
0
05 Nov 2025
LACY: A Vision-Language Model-based Language-Action Cycle for Self-Improving Robotic Manipulation
Youngjin Hong
Houjian Yu
Mingen Li
Changhyun Choi
LM&Ro
230
0
0
04 Nov 2025
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
Shichao Fan
K. Wu
Zhengping Che
X. Wang
Di Wu
...
M. M. Li
Qingjie Liu
Shanghang Zhang
Min Wan
Yong Dai
269
2
0
04 Nov 2025
Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng
Phillip Lippe
Sara Magliacane
OffRL
OCL
318
0
0
04 Nov 2025
PixelVLA: Advancing Pixel-level Understanding in Vision-Language-Action Model
Wenqi Liang
Gan Sun
Yao He
Jiahua Dong
Suyan Dai
Ivan Laptev
Salman Khan
Yang Cong
LM&Ro
3DV
VLM
217
3
0
03 Nov 2025
Scaling Cross-Embodiment World Models for Dexterous Manipulation
Zihao He
Bo Ai
Tongzhou Mu
Yulin Liu
Weikang Wan
Jiawei Fu
Yilun Du
Henrik I. Christensen
H. Su
200
3
0
03 Nov 2025
RobustVLA: Robustness-Aware Reinforcement Post-Training for Vision-Language-Action Models
Hongyin Zhang
Shuo Zhang
Junxi Jin
Qixin Zeng
Runze Li
Donglin Wang
VLM
353
2
0
03 Nov 2025
EgoMI: Learning Active Vision and Whole-Body Manipulation from Egocentric Human Demonstrations
Justin Yu
Yide Shentu
Di Wu
Pieter Abbeel
Ken Goldberg
Philipp Wu
92
2
0
31 Oct 2025
A Step Toward World Models: A Survey on Robotic Manipulation
Peng-Fei Zhang
Ying Cheng
Xiaofan Sun
S. Wang
Lei Zhu
Heng Tao Shen
LM&Ro
757
3
0
31 Oct 2025
DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
Cheng Yin
Yankai Lin
Wang Xu
Sikyuen Tam
Xiangrui Zeng
Zhiyuan Liu
Zhouping Yin
LRM
188
1
0
31 Oct 2025