2406.09246
Cited By
v1
v2 (latest)
OpenVLA: An Open-Source Vision-Language-Action Model
13 June 2024
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
Suraj Nair
Rafael Rafailov
Ethan P. Foster
Grace Lam
Pannag R Sanketi
Quan Vuong
Thomas Kollar
Benjamin Burchfiel
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro
VLM
ArXiv (abs)
PDF
HTML
HuggingFace (40 upvotes)
Papers citing
"OpenVLA: An Open-Source Vision-Language-Action Model"
50 / 723 papers shown
Embodied Scene Understanding for Vision Language Models via MetaVQA
Computer Vision and Pattern Recognition (CVPR), 2025
Weizhen Wang
Chenda Duan
Zhenghao Peng
Yuxin Liu
Bolei Zhou
LM&Ro
330
9
0
17 Jan 2025
Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning
IEEE International Conference on Robotics and Automation (ICRA), 2025
Juntao Ren
Priya Sundaresan
Dorsa Sadigh
Sanjiban Choudhury
Jeannette Bohg
316
50
0
13 Jan 2025
Whole-Body Integrated Motion Planning for Aerial Manipulators
Weiliang Deng
Hongming Chen
Biyu Ye
Haoran Chen
Ximin Lyu
307
7
0
11 Jan 2025
OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints
Computer Vision and Pattern Recognition (CVPR), 2025
Mingjie Pan
Jiyao Zhang
Tianshu Wu
Yinghao Zhao
Wenlong Gao
Hao Dong
LM&Ro
267
45
0
08 Jan 2025
Visual Large Language Models for Generalized and Specialized Applications
Jiayi Zhang
Zhixin Lai
Wentao Bao
Zhen Tan
Anh Dao
Kewei Sui
Jiayi Shen
Dong Liu
Huan Liu
Yu Kong
VLM
465
33
0
06 Jan 2025
T-DOM: A Taxonomy for Robotic Manipulation of Deformable Objects
David Blanco Mulero
Yifei Dong
Júlia Borràs
Florian T. Pokorny
Carme Torras
255
7
0
31 Dec 2024
UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
Fangwei Zhong
Kui Wu
Churan Wang
Hao Chen
Hai Ci
Zhoujun Li
Yizhou Wang
VGen
335
11
0
30 Dec 2024
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
Computer Vision and Pattern Recognition (CVPR), 2024
Jihan Yang
Shusheng Yang
Anjali W. Gupta
Rilyn Han
Li Fei-Fei
Saining Xie
LRM
528
349
0
18 Dec 2024
Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection
Computer Vision and Pattern Recognition (CVPR), 2024
Le Yang
Ziwei Zheng
Boxu Chen
Subrat Kishore Dutta
Chenhao Lin
Chao Shen
VLM
612
23
0
18 Dec 2024
RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation
Kun Wu
Chengkai Hou
Jiaming Liu
Zhengping Che
Xiaozhu Ju
...
Zhenyu Wang
Pengju An
Siyuan Qian
Shanghang Zhang
Jian Tang
LM&Ro
582
98
0
18 Dec 2024
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning
International Conference on Learning Representations (ICLR), 2024
Moritz Reuss
Jyothish Pari
Pulkit Agrawal
Rudolf Lioutikov
DiffM
MoE
298
30
0
17 Dec 2024
AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
Guanxing Lu
Tengbo Yu
Haoyuan Deng
Season Si Chen
Yansong Tang
Ziwei Wang
460
9
0
09 Dec 2024
A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
ACM Computing Surveys (ACM CSUR), 2024
Ola Shorinwa
Zhiting Mei
Justin Lidard
Allen Z. Ren
Anirudha Majumdar
HILM
LRM
438
19
0
07 Dec 2024
Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control
Seongmin Park
Hyungmin Kim
Wonseok Jeon
Juyoung Yang
Byeongwook Jeon
Yoonseon Oh
Jungwook Choi
348
12
0
02 Dec 2024
Robot Learning with Super-Linear Scaling
M. Torné
Arhan Jain
Jiayi Yuan
Vidaaranya Macha
Lars L. Ankile
Anthony Simeonov
Pulkit Agrawal
Abhishek Gupta
OffRL
LM&Ro
310
11
0
02 Dec 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
358
195
0
29 Nov 2024
RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World
Weixin Mao
Weiheng Zhong
Zhou Jiang
Dong Fang
Zhongyue Zhang
...
Fan Jia
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
587
16
0
29 Nov 2024
Don't Let Your Robot be Harmful: Responsible Robotic Manipulation via Safety-as-Policy
IEEE Robotics and Automation Letters (RA-L), 2024
Minheng Ni
Lei Zhang
Zhaoyu Chen
Guang Dai
Wangmeng Zuo
Jianwei Zhang
432
1
0
27 Nov 2024
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics
Computer Vision and Pattern Recognition (CVPR), 2024
Chan Hee Song
Valts Blukis
Jonathan Tremblay
Stephen Tyree
Yu-Chuan Su
Stan Birchfield
889
84
0
25 Nov 2024
Inference-Time Policy Steering through Human Interactions
IEEE International Conference on Robotics and Automation (ICRA), 2024
Yanwei Wang
Lirui Wang
Yilun Du
Balakumar Sundaralingam
Xuning Yang
Yu-Wei Chao
Claudia Pérez-D'Arpino
Dieter Fox
Julie Shah
VGen
506
25
0
25 Nov 2024
Iris: Integrating Language into Diffusion-based Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
757
5
0
24 Nov 2024
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
Computer Vision and Pattern Recognition (CVPR), 2024
Jiange Yang
Haoyi Zhu
Yanjie Wang
Gangshan Wu
Tong He
Limin Wang
451
11
0
21 Nov 2024
Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Taowen Wang
Cheng Han
James Liang
Wenhao Yang
Dongfang Liu
Luna Xinyu Zhang
Qifan Wang
Jiebo Luo
Ruixiang Tang
AAML
632
32
0
18 Nov 2024
Few-Shot Task Learning through Inverse Generative Modeling
Neural Information Processing Systems (NeurIPS), 2024
Aviv Netanyahu
Yilun Du
Antonia Bronars
Jyothish Pari
J. Tenenbaum
Tianmin Shu
Pulkit Agrawal
493
5
0
07 Nov 2024
STEER: Flexible Robotic Manipulation via Dense Language Grounding
IEEE International Conference on Robotics and Automation (ICRA), 2024
Laura Smith
A. Irpan
Montserrat Gonzalez Arenas
Sean Kirmani
Dmitry Kalashnikov
Dhruv Shah
Ted Xiao
LLMSV
306
8
0
05 Nov 2024
Addressing Failures in Robotics using Vision-Based Language Models (VLMs) and Behavior Trees (BT)
Faseeh Ahmad
Jonathan Styrud
Volker Krueger
287
4
0
03 Nov 2024
CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision
Gi-Cheon Kang
Junghyun Kim
Kyuhwan Shim
Jun Ki Lee
Byoung-Tak Zhang
LM&Ro
861
13
1
01 Nov 2024
Local Policies Enable Zero-shot Long-horizon Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Murtaza Dalal
Min Liu
Walter Talbott
Chen Chen
Deepak Pathak
Jian Zhang
Ruslan Salakhutdinov
415
22
0
29 Oct 2024
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Pöppel
Johannes Brandstetter
Günter Klambauer
Razvan Pascanu
Sepp Hochreiter
782
10
0
29 Oct 2024
HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots
IEEE International Conference on Robotics and Automation (ICRA), 2024
Tairan He
Wenli Xiao
Toru Lin
Zhengyi Luo
Zhenjia Xu
...
Changliu Liu
Guanya Shi
Xiaolong Wang
Linxi Fan
Yuke Zhu
346
90
0
28 Oct 2024
MotionGlot: A Multi-Embodied Motion Generation Model
IEEE International Conference on Robotics and Automation (ICRA), 2024
Sudarshan Harithas
Srinath Sridhar
404
3
0
22 Oct 2024
A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM
ByungOk Han
Jaehong Kim
Jinhyeok Jang
288
24
0
21 Oct 2024
VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making
Zuojin Tang
Bin-Bin Hu
Chenyang Zhao
De Ma
Gang Pan
Yinan Han
367
1
0
21 Oct 2024
Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance
Conference on Robot Learning (CoRL), 2024
Mitsuhiko Nakamoto
Oier Mees
Aviral Kumar
Sergey Levine
OffRL
361
49
0
17 Oct 2024
The State of Robot Motion Generation
Kostas E. Bekris
Joe H. Doerr
Patrick Meng
Sumanth Tangirala
3DV
349
3
0
16 Oct 2024
In-Context Learning Enables Robot Action Prediction in LLMs
IEEE International Conference on Robotics and Automation (ICRA), 2024
Yida Yin
Zekai Wang
Yuvan Sharma
Dantong Niu
Trevor Darrell
Roei Herzig
LM&Ro
505
16
0
16 Oct 2024
Latent Action Pretraining from Videos
International Conference on Learning Representations (ICLR), 2024
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
452
145
0
15 Oct 2024
Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback
Michelle Zhao
Reid G. Simmons
H. Admoni
Aaditya Ramdas
Andrea Bajcsy
695
7
0
11 Oct 2024
Zero-Shot Offline Imitation Learning via Optimal Transport
Thomas Rupf
Marco Bagatella
Nico Gürtler
Jonas Frey
Georg Martius
OffRL
1.1K
3
0
11 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
405
41
0
10 Oct 2024
Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning
Mariano Ramírez Montero
Ebrahim Shahabi
Giovanni Franzese
Jens Kober
Barbara Mazzolai
Cosimo Della Santina
295
0
0
10 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
International Conference on Learning Representations (ICLR), 2024
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
371
372
0
10 Oct 2024
Zero-Shot Generalization of Vision-Based RL Without Data Augmentation
Sumeet Batra
Gaurav Sukhatme
OffRL
DRL
292
3
0
09 Oct 2024
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
Zhijie Wang
Zhehua Zhou
Jiayang Song
Yuheng Huang
Zhan Shu
Lei Ma
229
6
0
07 Oct 2024
Control-oriented Clustering of Visual Latent Representation
International Conference on Learning Representations (ICLR), 2024
Han Qi
Haocheng Yin
Heng Yang
SSL
511
4
0
07 Oct 2024
Autoregressive Action Sequence Learning for Robotic Manipulation
IEEE Robotics and Automation Letters (RA-L), 2024
Xinyu Zhang
Yuhan Liu
Haonan Chang
Liam Schramm
Abdeslam Boularias
460
33
0
04 Oct 2024
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Haibo Wang
Zhiyang Xu
Yu Cheng
Shizhe Diao
Jiuxiang Gu
Yixin Cao
Qifan Wang
Weifeng Ge
Lifu Huang
266
56
0
04 Oct 2024
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy
IEEE International Conference on Robotics and Automation (ICRA), 2024
Ricardo Garcia
Shizhe Chen
Cordelia Schmid
LM&Ro
344
35
0
02 Oct 2024
FoAM: Foresight-Augmented Multi-Task Imitation Policy for Robotic Manipulation
Litao Liu
Wentao Wang
Yifan Han
Zhuoli Xie
Pengfei Yi
Junyan Li
Yi Qin
Wenzhao Lian
386
2
0
29 Sep 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
IEEE International Conference on Robotics and Automation (ICRA), 2024
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
535
17
0
27 Sep 2024