OpenVLA: An Open-Source Vision-Language-Action Model

13 June 2024
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
Suraj Nair
Rafael Rafailov
Ethan P. Foster
Grace Lam
Pannag R Sanketi
Quan Vuong
Thomas Kollar
Benjamin Burchfiel
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro, VLM
ArXiv (abs) | PDF | HTML | HuggingFace (40 upvotes)

Papers citing "OpenVLA: An Open-Source Vision-Language-Action Model"

50 / 727 papers shown
The AI Agent Index
Stephen Casper
Luke Bailey
Rosco Hunter
Carson Ezell
Emma Cabalé
...
Phillip J. K. Christoffersen
A. Pinar Ozisik
Rakshit Trivedi
Dylan Hadfield-Menell
Noam Kolt
516
24
0
03 Feb 2025
Inference-Time Enhancement of Generative Robot Policies via Predictive World Modeling
Han Qi
Haocheng Yin
Aris Zhu
Yilun Du
Heng Yang
708
22
0
02 Feb 2025
Embodied Scene Understanding for Vision Language Models via MetaVQA
Computer Vision and Pattern Recognition (CVPR), 2025
Weizhen Wang
Chenda Duan
Zhenghao Peng
Yuxin Liu
Bolei Zhou
LM&Ro
350
11
0
17 Jan 2025
Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning
IEEE International Conference on Robotics and Automation (ICRA), 2025
Juntao Ren
Priya Sundaresan
Dorsa Sadigh
Sanjiban Choudhury
Jeannette Bohg
387
63
0
13 Jan 2025
Whole-Body Integrated Motion Planning for Aerial Manipulators
Weiliang Deng
Hongming Chen
Biyu Ye
Haoran Chen
Ximin Lyu
X. Lyu
374
7
0
11 Jan 2025
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Computer Vision and Pattern Recognition (CVPR), 2025
Mingjie Pan
Jiyao Zhang
Tianshu Wu
Yinghao Zhao
Wenlong Gao
Hao Dong
LM&Ro
298
56
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Jiayi Zhang
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
499
35
0
06 Jan 2025
T-DOM: A Taxonomy for Robotic Manipulation of Deformable Objects
David Blanco Mulero
Yifei Dong
Júlia Borras
Florian T. Pokorny
Carme Torras
305
8
0
31 Dec 2024
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong
Kui Wu
Churan Wang
Hao Chen
Hai Ci
Zhoujun Li
Yizhou Wang
VGen
348
14
0
30 Dec 2024
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
Computer Vision and Pattern Recognition (CVPR), 2024
Jihan Yang
Shusheng Yang
Anjali W. Gupta
Rilyn Han
Li Fei-Fei
Saining Xie
LRM
576
428
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Computer Vision and Pattern Recognition (CVPR), 2024
Le Yang
Ziwei Zheng
Boxu Chen
Subrat Kishore Dutta
Chenhao Lin
Chao Shen
VLM
705
32
0
18 Dec 2024
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
Kun Wu
Chengkai Hou
Jiaming Liu
Zhengping Che
Xiaozhu Ju
...
Zhenyu Wang
Pengju An
Siyuan Qian
Shanghang Zhang
Jian Tang
LM&Ro
652
116
0
18 Dec 2024
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
International Conference on Learning Representations (ICLR), 2024
Moritz Reuss
Jyothish Pari
Pulkit Agrawal
Rudolf Lioutikov
DiffM, MoE
341
39
0
17 Dec 2024
AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
Guanxing Lu
Tengbo Yu
Haoyuan Deng
Season Si Chen
Yansong Tang
Ziwei Wang
502
11
0
09 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2024
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM, LRM
488
19
0
07 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
365
16
0
02 Dec 2024
Robot Learning with Super-Linear Scaling
M. Torné
Arhan Jain
Jiayi Yuan
Vidaaranya Macha
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Abhishek Gupta
OffRL, LM&Ro
348
12
0
02 Dec 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
388
242
0
29 Nov 2024
RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
Weixin Mao
Weiheng Zhong
Zhou Jiang
Dong Fang
Zhongyue Zhang
...
Fan Jia
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
655
21
0
29 Nov 2024
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation via Safety-as-Policy
IEEE Robotics and Automation Letters (RA-L), 2024
Minheng Ni
Lei Zhang
Zhaoyu Chen
Guang Dai
Wangmeng Zuo
Jianwei Zhang
Lei Zhang
W. Zuo
443
1
0
27 Nov 2024
Inference-Time Policy Steering through Human Interactions
IEEE International Conference on Robotics and Automation (ICRA), 2024
Yanwei Wang
Lirui Wang
Yilun Du
Balakumar Sundaralingam
Xuning Yang
Yu-Wei Chao
Claudia Pérez-D'Arpino
Dieter Fox
Julie Shah
VGen
552
31
0
25 Nov 2024
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Computer Vision and Pattern Recognition (CVPR), 2024
Chan Hee Song
Valts Blukis
Jonathan Tremblay
Stephen Tyree
Yu-Chuan Su
Stan Birchfield
993
102
0
25 Nov 2024
Iris: Integrating Language into Diffusion-based Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM, MDE
820
5
0
24 Nov 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Computer Vision and Pattern Recognition (CVPR), 2024
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
473
13
0
21 Nov 2024
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Taowen Wang
Cheng Han
James Liang
Wenhao Yang
Dongfang Liu
Luna Xinyu Zhang
Qifan Wang
Jiebo Luo
Ruixiang Tang
AAML
682
35
0
18 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Neural Information Processing Systems (NeurIPS), 2024
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
520
5
0
07 Nov 2024
STEER: Flexible Robotic Manipulation via Dense Language Grounding
IEEE International Conference on Robotics and Automation (ICRA), 2024
Laura Smith
A. Irpan
Montserrat Gonzalez Arenas
Sean Kirmani
Dmitry Kalashnikov
Dhruv Shah
Ted Xiao
LLMSV
316
9
0
05 Nov 2024
Addressing Failures in Robotics using Vision-Based Language Models (VLMs) and Behavior Trees (BT)
Faseeh Ahmad
Jonathan Styrud
Volker Krueger
302
5
0
03 Nov 2024
CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision
Gi-Cheon Kang
Junghyun Kim
Kyuhwan Shim
Jun Ki Lee
Byoung-Tak Zhang
LM&Ro
969
15
1
01 Nov 2024
Local Policies Enable Zero-shot Long-horizon Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Murtaza Dalal
Min Liu
Walter Talbott
Chen Chen
Deepak Pathak
Jian Zhang
Ruslan Salakhutdinov
503
30
0
29 Oct 2024
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Poppel
Johannes Brandstetter
Günter Klambauer
Razvan Pascanu
Sepp Hochreiter
823
12
0
29 Oct 2024
HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots
IEEE International Conference on Robotics and Automation (ICRA), 2024
Tairan He
Wenli Xiao
Toru Lin
Zhengyi Luo
Zhenjia Xu
...
Changliu Liu
Guanya Shi
Xiaolong Wang
Linxi Fan
Yuke Zhu
379
112
0
28 Oct 2024
MotionGlot: A Multi-Embodied Motion Generation Model
IEEE International Conference on Robotics and Automation (ICRA), 2024
Sudarshan Harithas
Srinath Sridhar
429
3
0
22 Oct 2024
A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM
ByungOk Han
Jaehong Kim
Jinhyeok Jang
406
34
0
21 Oct 2024
VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making
Zuojin Tang
Bin-Bin Hu
Chenyang Zhao
De Ma
Gang Pan
Yinan Han
393
1
0
21 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Conference on Robot Learning (CoRL), 2024
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
387
62
0
17 Oct 2024
The State of Robot Motion Generation
Kostas E. Bekris
Joe H. Doerr
Patrick Meng
Sumanth Tangirala
3DV
411
5
0
16 Oct 2024
In-Context Learning Enables Robot Action Prediction in LLMs
IEEE International Conference on Robotics and Automation (ICRA), 2024
Yida Yin
Zekai Wang
Yuvan Sharma
Dantong Niu
Trevor Darrell
Roei Herzig
LM&Ro
605
17
0
16 Oct 2024
Latent Action Pretraining from Videos
International Conference on Learning Representations (ICLR), 2024
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
522
191
0
15 Oct 2024
Zero-Shot Offline Imitation Learning via Optimal Transport
Thomas Rupf
Marco Bagatella
Nico Gürtler
Jonas Frey
Georg Martius
OffRL
1.2K
4
0
11 Oct 2024
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Michelle Zhao
Reid G. Simmons
H. Admoni
Aaditya Ramdas
Andrea Bajcsy
725
9
0
11 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
International Conference on Learning Representations (ICLR), 2024
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
415
463
0
10 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
430
50
0
10 Oct 2024
Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning
Mariano Ramírez Montero
Ebrahim Shahabi
Giovanni Franzese
Jens Kober
Barbara Mazzolai
Cosimo Della Santina
335
1
0
10 Oct 2024
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation
Sumeet Batra
Gaurav Sukhatme
OffRL, DRL
309
4
0
09 Oct 2024
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang
Zhehua Zhou
Jiayang Song
Yuheng Huang
Zhan Shu
Lei Ma
268
7
0
07 Oct 2024
Control-oriented Clustering of Visual Latent Representation
International Conference on Learning Representations (ICLR), 2024
Han Qi
Haocheng Yin
Heng Yang
SSL
583
7
0
07 Oct 2024
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Haibo Wang
Zhiyang Xu
Yu Cheng
Shizhe Diao
Jiuxiang Gu
Yixin Cao
Qifan Wang
Weifeng Ge
Lifu Huang
304
63
0
04 Oct 2024
Autoregressive Action Sequence Learning for Robotic Manipulation
IEEE Robotics and Automation Letters (RA-L), 2024
Xinyu Zhang
Yuhan Liu
Haonan Chang
Liam Schramm
Abdeslam Boularias
524
35
0
04 Oct 2024
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy
IEEE International Conference on Robotics and Automation (ICRA), 2024
Ricardo Garcia
Shizhe Chen
Cordelia Schmid
LM&Ro
374
39
0
02 Oct 2024