2406.09246
OpenVLA: An Open-Source Vision-Language-Action Model
13 June 2024
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
Suraj Nair
Rafael Rafailov
Ethan P. Foster
Grace Lam
Pannag R Sanketi
Quan Vuong
Thomas Kollar
Benjamin Burchfiel
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
Papers citing "OpenVLA: An Open-Source Vision-Language-Action Model"
50 / 710 papers shown
SAGE: Scene Graph-Aware Guidance and Execution for Long-Horizon Manipulation Tasks
Jialiang Li
Wenzheng Wu
Gaojing Zhang
Yifan Han
Wenzhao Lian
LM&Ro
129
0
0
26 Sep 2025
VLBiMan: Vision-Language Anchored One-Shot Demonstration Enables Generalizable Bimanual Robotic Manipulation
Huayi Zhou
Kui Jia
LM&Ro
190
0
0
26 Sep 2025
Developing Vision-Language-Action Model from Egocentric Videos
Tomoya Yoshida
Shuhei Kurita
Taichi Nishimura
Shinsuke Mori
104
1
0
26 Sep 2025
On Robustness of Vision-Language-Action Model against Multi-Modal Perturbations
Jianing Guo
Z. Wu
Chang Tu
Yiyao Ma
Xiangqi Kong
...
Qi Dou
Yaodong Yang
Huijie Zhao
Weifeng Lv
Simin Li
AAML
VLM
272
3
0
26 Sep 2025
WoW: Towards a World omniscient World model Through Embodied Interaction
Xiaowei Chi
Peidong Jia
Chun-Kai Fan
Xiaozhu Ju
Weishi Mi
...
Wei Xue
Sirui Han
Yike Guo
Shanghang Zhang
Yong Dai
VGen
160
2
0
26 Sep 2025
From Watch to Imagine: Steering Long-horizon Manipulation via Human Demonstration and Future Envisionment
Ke Ye
Jiaming Zhou
Yuanfeng Qiu
Jiayi Liu
Shihui Zhou
Kun-Yu Lin
Junwei Liang
VGen
182
1
0
26 Sep 2025
ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation
Nan Tang
Jing-Cheng Pang
Guanlin Li
Chao Qian
Yang Yu
156
0
0
26 Sep 2025
Pixel Motion Diffusion is What We Need for Robot Control
E-Ro Nguyen
Y. Zhang
Kanchana Ranasinghe
Xiang Li
Michael S. Ryoo
DiffM
137
0
0
26 Sep 2025
RetoVLA: Reusing Register Tokens for Spatial Reasoning in Vision-Language-Action Models
Jiyeon Koo
Taewan Cho
Hyunjoon Kang
Eunseom Pyo
Tae Gyun Oh
Taeryang Kim
Andrew Jaeyong Choi
70
1
0
25 Sep 2025
AnywhereVLA: Language-Conditioned Exploration and Mobile Manipulation
Konstantin Gubernatorov
Artem Voronov
Roman Voronov
Sergei Pasynkov
S. Perminov
Ziang Guo
Dzmitry Tsetserukou
LM&Ro
109
0
0
25 Sep 2025
Diffusion-Based Impedance Learning for Contact-Rich Manipulation Tasks
Noah Geiger
Tamim Asfour
Neville Hogan
Johannes Lachner
180
0
0
24 Sep 2025
RoboSSM: Scalable In-context Imitation Learning via State-Space Models
Youngju Yoo
Jiaheng Hu
Yifeng Zhu
Bo Liu
Qiang Liu
Roberto Martín-Martín
Peter Stone
108
0
0
24 Sep 2025
mindmap: Spatial Memory in Deep Feature Maps for 3D Action Policies
Remo Steiner
A. Millane
David Tingdahl
Clemens Volk
Vikram Ramasamy
Xinjie Yao
Peter Du
Soha Pouya
Shiwei Sheng
191
1
0
24 Sep 2025
One Filters All: A Generalist Filter for State Estimation
Shiqi Liu
Wenhan Cao
Chang Liu
Zeyu He
Tianyi Zhang
Jingliang Duan
OffRL
160
1
0
24 Sep 2025
Embodied AI: From LLMs to World Models
Tongtong Feng
Xin Wang
Yu Jiang
Wenwu Zhu
LM&Ro
329
8
0
24 Sep 2025
Parse-Augment-Distill: Learning Generalizable Bimanual Visuomotor Policies from Single Human Video
Georgios Tziafas
Jiayun Zhang
Hamidreza Kasaei
148
0
0
24 Sep 2025
Beyond Human Demonstrations: Diffusion-Based Reinforcement Learning to Generate Data for VLA Training
Rushuai Yang
Hangxing Wei
Ran Zhang
Zhiyuan Feng
Xiaoyu Chen
...
Chuheng Zhang
Li Zhao
Jiang Bian
Xiu Su
Yi-Ling Chen
250
2
0
24 Sep 2025
FreezeVLA: Action-Freezing Attacks against Vision-Language-Action Models
Xin Wang
Jie Li
Zejia Weng
Yixu Wang
Yifeng Gao
...
Yan Teng
Yingchun Wang
Zuxuan Wu
Jiabo He
Yu Jiang
AAML
170
1
0
24 Sep 2025
Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving
Pengxiang Li
Yinan Zheng
Y. Wang
Huimin Wang
Hang Zhao
Jingjing Liu
Xianyuan Zhan
Kun Zhan
Xianpeng Lang
112
7
0
24 Sep 2025
3D Flow Diffusion Policy: Visuomotor Policy Learning via Generating Flow in 3D Space
Sangjun Noh
Dongwoo Nam
Kangmin Kim
Geonhyup Lee
Yeonguk Yu
Raeyoung Kang
K. Lee
VGen
98
1
0
23 Sep 2025
Agentic Scene Policies: Unifying Space, Semantics, and Affordances for Robot Action
Sacha Morin
Kumaraditya Gupta
Mahtab Sandhu
Charlie Gauthier
F. Argenziano
Kirsty Ellis
Liam Paull
LM&Ro
133
0
0
23 Sep 2025
Do You Need Proprioceptive States in Visuomotor Policies?
Juntu Zhao
Wenbo Lu
Di Zhang
Y. Liu
Yushen Liang
...
Yingdong Hu
Shengjie Wang
Junliang Guo
Yi Xu
Yang Gao
177
1
0
23 Sep 2025
How Far are VLMs from Visual Spatial Intelligence? A Benchmark-Driven Perspective
S. Yu
Yuxin Chen
Hao Ju
Lianjie Jia
Fuxi Zhang
...
Lin Song
Lijun Wang
Yanwei Li
Y. Shan
Huchuan Lu
LRM
319
9
0
23 Sep 2025
Residual Off-Policy RL for Finetuning Behavior Cloning Policies
Lars Ankile
Zhenyu Jiang
Rocky Duan
Guanya Shi
Pieter Abbeel
Anusha Nagabandi
OffRL
221
2
0
23 Sep 2025
Pure Vision Language Action (VLA) Models: A Comprehensive Survey
Dapeng Zhang
Jin Sun
Chenghui Hu
Xiaoyan Wu
Zhenlong Yuan
R. Zhou
Fei Shen
Qingguo Zhou
LM&Ro
294
15
0
23 Sep 2025
OmniVLA: An Omni-Modal Vision-Language-Action Model for Robot Navigation
Noriaki Hirose
Catherine Glossop
Dhruv Shah
Sergey Levine
LM&Ro
188
3
0
23 Sep 2025
SOE: Sample-Efficient Robot Policy Self-Improvement via On-Manifold Exploration
Yang Jin
Jun Lv
Han Xue
Wendi Chen
Chuan Wen
Cewu Lu
176
0
0
23 Sep 2025
VGGT-DP: Generalizable Robot Control via Vision Foundation Models
Shijia Ge
Yinxin Zhang
Shuzhao Xie
Weixiang Zhang
Mingcai Zhou
Zhi Wang
84
0
0
23 Sep 2025
EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data
Ryan Punamiya
Dhruv Patel
Patcharapong Aphiwetsa
Pranav Kuppili
Lawrence Y. Zhu
Simar Kareer
Judy Hoffman
Danfei Xu
221
3
0
23 Sep 2025
Growing with Your Embodied Agent: A Human-in-the-Loop Lifelong Code Generation Framework for Long-Horizon Manipulation Skills
Y. Meng
Zhenguo Sun
Max Fest
Xukun Li
Zhenshan Bing
Alois Knoll
LM&Ro
162
0
0
23 Sep 2025
OpenGVL -- Benchmarking Visual Temporal Progress for Data Curation
Paweł Budzianowski
Emilia Wisnios
Gracjan Góral
Igor Kulakov
Viktor Petrenko
Krzysztof Walas
165
0
0
22 Sep 2025
PEEK: Guiding and Minimal Image Representations for Zero-Shot Generalization of Robot Manipulation Policies
Jesse Zhang
Marius Memmel
Kevin Kim
Dieter Fox
Jesse Thomason
Fabio Ramos
Erdem Bıyık
Abhishek Gupta
Anqi Li
LM&Ro
129
1
0
22 Sep 2025
UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning
Ye Liu
Zongyang Ma
Junfu Pu
Zhongang Qi
Yang Wu
Mingyu Ding
Chang Wen Chen
MLLM
ObjD
LRM
368
2
0
22 Sep 2025
Latent Action Pretraining Through World Modeling
Bahey Tharwat
Yara Nasser
Ali Abouzeid
Ian Reid
LM&Ro
SSL
VLM
211
1
0
22 Sep 2025
History-Aware Visuomotor Policy Learning via Point Tracking
Jingjing Chen
Hongjie Fang
Chenxi Wang
Shiquan Wang
Cewu Lu
152
2
0
21 Sep 2025
FILIC: Dual-Loop Force-Guided Imitation Learning with Impedance Torque Control for Contact-Rich Manipulation Tasks
Haizhou Ge
Ruixiang Wang
Zheng Li
Yue Li
Zhixing Chen
Ruqi Huang
Longhua Ma
92
0
0
21 Sep 2025
TranTac: Leveraging Transient Tactile Signals for Contact-Rich Robotic Manipulation
Yinghao Wu
Shuhong Hou
Haowen Zheng
Yichen Li
Weiyi Lu
Xun Zhou
Yitian Shao
132
0
0
20 Sep 2025
CoReVLA: A Dual-Stage End-to-End Autonomous Driving Framework for Long-Tail Scenarios via Collect-and-Refine
Shiyu Fang
Yiming Cui
Haoyang Liang
Chen Lv
Peng Hang
Jian Sun
151
4
0
19 Sep 2025
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
Shaopeng Zhai
Qi Zhang
Tianyi Zhang
Fuxian Huang
Haoran Zhang
Ming Zhou
Shengzhe Zhang
Litao Liu
Sixu Lin
Jiangmiao Pang
OffRL
192
10
0
19 Sep 2025
See&Trek: Training-Free Spatial Prompting for Multimodal Large Language Model
Pengteng Li
Pinhao Song
Wuyang Li
Weiyu Guo
Huizai Yao
Ziyang Chen
Dugang Liu
Hui Xiong
LRM
VLM
120
1
0
19 Sep 2025
Compose by Focus: Scene Graph-based Atomic Skills
Han Qi
Changhe Chen
Heng Yang
OCL
CoGe
260
1
0
19 Sep 2025
I-FailSense: Towards General Robotic Failure Detection with Vision-Language Models
Clemence Grislain
Hamed Rahimi
Olivier Sigaud
Mohamed Chetouani
170
0
0
19 Sep 2025
GP3: A 3D Geometry-Aware Policy with Multi-View Images for Robotic Manipulation
Quanhao Qian
Guoyang Zhao
Gongjie Zhang
Jiuniu Wang
Ran Xu
Junlong Gao
Deli Zhao
129
3
0
19 Sep 2025
RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation
Chao Yu
Y. Wang
Zhen Guo
Hao Lin
Si Xu
...
Z. Yang
Guohao Dai
Yu Wang
AI4CE
124
3
0
19 Sep 2025
ExT: Towards Scalable Autonomous Excavation via Large-Scale Multi-Task Pretraining and Fine-Tuning
Yifan Zhai
Lorenzo Terenzi
Patrick Frey
Diego Garcia Soto
Pascal Egli
Marco Hutter
175
0
0
18 Sep 2025
CollabVLA: Self-Reflective Vision-Language-Action Model Dreaming Together with Human
Nan Sun
Yongchang Li
Chenxu Wang
Huiying Li
Huaping Liu
LM&Ro
VLM
118
1
0
18 Sep 2025
COMPASS: Confined-space Manipulation Planning with Active Sensing Strategy
Qixuan Li
Chen Le
Dongyue Huang
Jincheng Yu
Xinlei Chen
100
0
0
18 Sep 2025
Ask-to-Clarify: Resolving Instruction Ambiguity through Multi-turn Dialogue
Xingyao Lin
Xinghao Zhu
Tianyi Lu
Sicheng Xie
Hui Zhang
Xipeng Qiu
Zuxuan Wu
Yu-Gang Jiang
242
0
0
18 Sep 2025
RealMirror: A Comprehensive, Open-Source Vision-Language-Action Platform for Embodied AI
Cong Tai
Zhaoyu Zheng
Haixu Long
Hansheng Wu
Haodong Xiang
...
Ruifeng Li
Jun Huang
Bin Chang
Shuai Feng
Tao Shen
VLM
152
1
0
18 Sep 2025
Self-Improving Embodied Foundation Models
Seyed Kamyar Seyed Ghasemipour
Ayzaan Wahid
Jonathan Tompson
Pannag R Sanketi
Igor Mordatch
LM&Ro
LRM
144
5
0
18 Sep 2025