arXiv: 2406.09246
OpenVLA: An Open-Source Vision-Language-Action Model

13 June 2024
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
Suraj Nair
Rafael Rafailov
Ethan P. Foster
Grace Lam
Pannag R Sanketi
Quan Vuong
Thomas Kollar
Benjamin Burchfiel
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro, VLM
ArXiv (abs) | PDF | HTML | HuggingFace (40 upvotes)

Papers citing "OpenVLA: An Open-Source Vision-Language-Action Model"

50 / 723 papers shown
FALCON: Actively Decoupled Visuomotor Policies for Loco-Manipulation with Foundation-Model-Based Coordination
Chengyang He
Ge Sun
Yue Bai
Junkai Lu
Jiadong Zhao
Guillaume Sartoretti
160
0
0
04 Dec 2025
Vision-Language-Action Models for Selective Robotic Disassembly: A Case Study on Critical Component Extraction from Desktops
Chang Liu
Sibo Tian
Sara Behdad
Xiao Liang
Minghui Zheng
46
0
0
04 Dec 2025
MOVE: A Simple Motion-Based Data Collection Paradigm for Spatial Generalization in Robotic Manipulation
Huanqian Wang
C. Chen
Yang Yue
Danhua Tao
Tong Guo
Shaoxuan Xie
Denghang Huang
Shiji Song
Guocai Yao
Gao Huang
68
0
0
04 Dec 2025
SIMA 2: A Generalist Embodied Agent for Virtual Worlds
Sima Team
Adrian Bolton
Alexander Lerchner
Alexandra Cordell
Alexandre Moufarek
...
Tyson Roberts
Volodymyr Mnih
Y. Liu
Z. Wang
Zoubin Ghahramani
LLMAG, LM&Ro
254
2
0
04 Dec 2025
Hierarchical Vision Language Action Model Using Success and Failure Demonstrations
Jeongeun Park
Jihwan Yoon
Byungwoo Jeon
Juhan Park
Jinwoo Shin
Namhoon Cho
Kyungjae Lee
Sangdoo Yun
Sungjoon Choi
OffRL
212
0
0
03 Dec 2025
SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
Siyi Chen
Mikaela Angelina Uy
Chan Hee Song
Faisal Ladhak
Adithyavairavan Murali
Qing Qu
Stan Birchfield
Valts Blukis
Jonathan Tremblay
OffRL, LRM
163
0
0
03 Dec 2025
Multimodal Reinforcement Learning with Agentic Verifier for AI Agents
Reuben Tan
Baolin Peng
Zhengyuan Yang
Hao Cheng
Oier Mees
...
Xiaodong Liu
Lijuan Wang
Marc Pollefeys
Yong Jae Lee
Jianfeng Gao
OffRL, LRM
195
1
0
03 Dec 2025
RoboScape-R: Unified Reward-Observation World Models for Generalizable Robotics Training via RL
Yinzhou Tang
Yu Shang
Yinuo Chen
Bingwen Wei
Xin Zhang
...
Liangzhi Shi
Chao Yu
Chen Gao
Wei Wu
Yong Li
118
0
0
03 Dec 2025
Diagnose, Correct, and Learn from Manipulation Failures via Visual Symbols
Xianchao Zeng
Xinyu Zhou
Youcheng Li
Jiayou Shi
Tianle Li
L. Chen
Lei Ren
Y. Li
104
0
0
02 Dec 2025
SAM2Grasp: Resolve Multi-modal Grasping via Prompt-conditioned Temporal Action Prediction
Shengkai Wu
Jinrong Yang
Wenqiu Luo
Linfeng Gao
Chaohui Shang
Meiyu Zhi
Mingshan Sun
Fangping Yang
Liangliang Ren
Yong Zhao
132
0
0
02 Dec 2025
Video2Act: A Dual-System Video Diffusion Policy with Robotic Spatio-Motional Modeling
Yueru Jia
Jiaming Liu
Shengbang Liu
Rui Zhou
W. Yu
Yuyang Yan
Xiaowei Chi
Yandong Guo
Boxin Shi
Shanghang Zhang
VGen
312
2
0
02 Dec 2025
IGen: Scalable Data Generation for Robot Learning from Open-World Images
Chenghao Gu
Haolan Kang
Junchao Lin
Jinghe Wang
Duo Wu
...
Ziyang Gong
Letian Li
Hongying Zheng
Changwei Lv
Zhi Wang
VGen, LM&Ro
163
0
0
01 Dec 2025
DiG-Flow: Discrepancy-Guided Flow Matching for Robust VLA Models
Wanpeng Zhang
Ye Wang
Hao Luo
Haoqi Yuan
Yicheng Feng
Sipeng Zheng
Qin Jin
Zongqing Lu
173
1
0
01 Dec 2025
ManualVLA: A Unified VLA Model for Chain-of-Thought Manual Generation and Robotic Manipulation
Chenyang Gu
Jiaming Liu
Hao Chen
Runzhong Huang
Qingpo Wuwu
...
Ying Li
Renrui Zhang
Peng Jia
Pheng-Ann Heng
Shanghang Zhang
LM&Ro
161
1
0
01 Dec 2025
CycleManip: Enabling Cyclic Task Manipulation via Effective Historical Perception and Understanding
Yi-Lin Wei
Haoran Liao
Yuhao Lin
Pengyue Wang
Zhizhao Liang
Guiliang Liu
Wei-Shi Zheng
57
0
0
30 Nov 2025
Transforming Monolithic Foundation Models into Embodied Multi-Agent Architectures for Human-Robot Collaboration
Nan Sun
Bo Mao
Yongchang Li
Chenxu Wang
Di Guo
Huaping Liu
LM&Ro
113
0
0
30 Nov 2025
VLASH: Real-Time VLAs via Future-State-Aware Asynchronous Inference
Jiaming Tang
Yufei Sun
Yilong Zhao
Shang Yang
Yujun Lin
Zhuoyang Zhang
James Hou
Yao Lu
Zhijian Liu
Song Han
58
5
0
30 Nov 2025
Sigma: The Key for Vision-Language-Action Models toward Telepathic Alignment
Libo Wang
135
0
0
30 Nov 2025
SwiftVLA: Unlocking Spatiotemporal Dynamics for Lightweight VLA Models at Minimal Overhead
Chaojun Ni
Cheng Chen
Xiaofeng Wang
Zheng Zhu
Wenzhao Zheng
...
Qiang Zhang
Yun Ye
Yang Wang
Guan Huang
Wenjun Mei
117
0
0
30 Nov 2025
RealAppliance: Let High-fidelity Appliance Assets Controllable and Workable as Aligned Real Manuals
Yuzheng Gao
Yuxing Long
Lei Kang
Yuchong Guo
Ziyan Yu
...
Jiyao Zhang
Ruihai Wu
Dongjiang Li
Hui Shen
Hao Dong
30
0
0
29 Nov 2025
LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models
Zuolei Li
Xingyu Gao
Xiaofan Wang
Jianlong Fu
LM&Ro
157
0
0
28 Nov 2025
SafeHumanoid: VLM-RAG-driven Control of Upper Body Impedance for Humanoid Robot
Yara Mahmoud
Jeffrin Sam
Nguyen Khang
Marcelino Fernando
Issatay Tokmurziyev
Miguel Altamirano Cabrera
Muhammad Haris Khan
Artem Lykov
Dzmitry Tsetserukou
118
0
0
28 Nov 2025
CoT4AD: A Vision-Language-Action Model with Explicit Chain-of-Thought Reasoning for Autonomous Driving
Zhaohui Wang
Tengbo Yu
Hao Tang
LRM
171
0
0
27 Nov 2025
Distracted Robot: How Visual Clutter Undermine Robotic Manipulation
Amir Rasouli
Montgomery Alban
Sajjad Pakdamansavoji
Zhiyuan Li
Zhanguang Zhang
Aaron Wu
Xuan Zhao
74
0
0
27 Nov 2025
Mechanistic Finetuning of Vision-Language-Action Models via Few-Shot Demonstrations
Chancharik Mitra
Yusen Luo
Raj Saravanan
Dantong Niu
Anirudh Pai
Jesse Thomason
Trevor Darrell
Abrar Anwar
Deva Ramanan
Roei Herzig
62
0
0
27 Nov 2025
LLM-Based Generalizable Hierarchical Task Planning and Execution for Heterogeneous Robot Teams with Event-Driven Replanning
Suraj Borate
Bhavish Rai B
Vipul Pardeshi
Madhu Vadali
60
0
0
27 Nov 2025
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action
Zhen Fang
Zhuoyang Liu
Jiaming Liu
Hao Chen
Y. Zeng
Shiting Huang
Zehui Chen
L. Chen
Shanghang Zhang
Feng Zhao
LRM
112
3
0
27 Nov 2025
From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
Jiajie Zhang
Sören Schwertfeger
Alexander Kleiner
103
0
0
26 Nov 2025
VacuumVLA: Boosting VLA Capabilities via a Unified Suction and Gripping Tool for Complex Robotic Manipulation
Hui Zhou
Siyuan Huang
Minxing Li
Hao Zhang
Lue Fan
Shaoshuai Shi
190
0
0
26 Nov 2025
Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models
Naifu Zhang
Wei Tao
Xi Xiao
Qianpu Sun
Yuxin Zheng
Wentao Mo
Peiqiang Wang
Nan Zhang
AAML, VLM
797
0
0
26 Nov 2025
Hyper-GoalNet: Goal-Conditioned Manipulation Policy Learning with HyperNetworks
Pei Zhou
Wanting Yao
Qian Luo
Xunzhe Zhou
Yanchao Yang
86
1
0
26 Nov 2025
$\mathcal{E}_0$: Enhancing Generalization and Fine-Grained Control in VLA Models via Continuized Discrete Diffusion
Zhihao Zhan
Jiaying Zhou
Likui Zhang
Qinhan Lv
Hao Liu
...
Ziliang Chen
Tianshui Chen
Keze Wang
Liang Lin
Guangrun Wang
VGen, VLM
213
1
0
26 Nov 2025
TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos
Seungjae Lee
Yoonkyo Jung
Inkook Chun
Yao-Chih Lee
Zikui Cai
...
Aayush Talreja
Tan Dat Dao
Yongyuan Liang
Jia-Bin Huang
Furong Huang
109
0
0
26 Nov 2025
When Robots Obey the Patch: Universal Transferable Patch Attacks on Vision-Language-Action Models
Hui Lu
Yi Yu
Yiming Yang
Chenyu Yi
Qixin Zhang
Bingquan Shen
Alex Chichung Kot
Xudong Jiang
AAML
488
0
0
26 Nov 2025
Unifying Perception and Action: A Hybrid-Modality Pipeline with Implicit Visual Chain-of-Thought for Robotic Action Generation
Xiangkai Ma
Lekai Xing
Han Zhang
Wenzhong Li
Sanglu Lu
LM&Ro, VGen
213
0
0
25 Nov 2025
DeeAD: Dynamic Early Exit of Vision-Language Action for Efficient Autonomous Driving
Haibo Hu
Lianming Huang
Nan Guan
Chun Jason Xue
VLM
213
0
0
25 Nov 2025
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Yunze Man
S. S. Wang
Guowen Zhang
Johan Bjorck
Zhiqi Li
Liang-Yan Gui
Jim Fan
Jan Kautz
Yu Wang
Zhiding Yu
132
0
0
25 Nov 2025
Arcadia: Toward a Full-Lifecycle Framework for Embodied Lifelong Learning
Minghe Gao
Juncheng Billy Li
Yuze Lin
Xuqi Liu
Jiaming Ji
...
Kai Shen
Jun Xiao
Qi Wu
Siliang Tang
Yueting Zhuang
AI4CE
102
1
0
25 Nov 2025
Semantic Router: On the Feasibility of Hijacking MLLMs via a Single Adversarial Perturbation
Changyue Li
Jiaying Li
Youliang Yuan
Jiaming He
Zhicong Huang
Pinjia He
AAML
250
0
0
25 Nov 2025
Dynamic Test-Time Compute Scaling in Control Policy: Difficulty-Aware Stochastic Interpolant Policy
Inkook Chun
Seungjae Lee
M. S. Albergo
Saining Xie
Eric Vanden-Eijnden
143
0
0
25 Nov 2025
Reinforcing Action Policies by Prophesying
Jiahui Zhang
Ze Huang
Chun Gu
Zipei Ma
Li Zhang
233
1
0
25 Nov 2025
Percept-WAM: Perception-Enhanced World-Awareness-Action Model for Robust End-to-End Autonomous Driving
J. N. Han
Meng Tian
Jiangtong Zhu
Fan He
Huixin Zhang
...
Siyuan Dong
Lu Hou
Qingqiu Huang
Xiaosong Jia
H. Xu
VLM
160
1
0
24 Nov 2025
Mixture of Horizons in Action Chunking
Dong Jing
Gang Wang
Jiaqi Liu
Weiliang Tang
Zelong Sun
Yunchao Yao
Zhenyu Wei
Y. Liu
Zhiwu Lu
Mingyu Ding
247
1
0
24 Nov 2025
Discover, Learn, and Reinforce: Scaling Vision-Language-Action Pretraining with Diverse RL-Generated Trajectories
Rushuai Yang
Zhiyuan Feng
Tianxiang Zhang
Kaixin Wang
Chuheng Zhang
Li Zhao
Xiu Su
Yi-Ling Chen
Jiang Bian
OffRL
209
0
0
24 Nov 2025
MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
Yuxia Fu
Zhizhen Zhang
Y. Zhang
Zijian Wang
Zi-Rui Huang
Yadan Luo
MoMe
315
1
0
24 Nov 2025
Compressor-VLA: Instruction-Guided Visual Token Compression for Efficient Robotic Manipulation
Juntao Gao
Feiyang Ye
Jing Zhang
Wenjing Qian
75
0
0
24 Nov 2025
AVA-VLA: Improving Vision-Language-Action models with Active Visual Attention
Lei Xiao
Jifeng Li
Juntao Gao
Feiyang Ye
Yan Jin
Jingjing Qian
Jing Zhang
Y. Wu
Xiaoyuan Yu
353
0
0
24 Nov 2025
ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models
Wencheng Ye
Tianshi Wang
Lei Zhu
Fengling Li
G. Yang
VLM
170
0
0
22 Nov 2025
EchoVLA: Robotic Vision-Language-Action Model with Synergistic Declarative Memory for Mobile Manipulation
Min Lin
Xiwen Liang
Bingqian Lin
Liu Jingzhi
Zijian Jiao
...
Yuhan Ma
Yuecheng Liu
Shen Zhao
Yuzheng Zhuang
Xiaodan Liang
LM&Ro
238
1
0
22 Nov 2025
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
Ting Huang
Dongjian Li
Rui Yang
Zeyu Zhang
Zida Yang
Hao Tang
LRM
128
4
0
22 Nov 2025