OpenVLA: An Open-Source Vision-Language-Action Model

13 June 2024

Quan Vuong

Dorsa Sadigh

Percy Liang

Chelsea Finn

LM&Ro

VLM

Papers citing "OpenVLA: An Open-Source Vision-Language-Action Model"

50 / 727 papers shown

...

Phillip J. K. Christoffersen

A. Pinar Ozisik

Rakshit Trivedi

Dylan Hadfield-Menell

Noam Kolt

516

03 Feb 2025

Inference-Time Enhancement of Generative Robot Policies via Predictive World Modeling

708

02 Feb 2025

Embodied Scene Understanding for Vision Language Models via MetaVQA

Computer Vision and Pattern Recognition (CVPR), 2025

350

17 Jan 2025

Motion Tracks: A Unified Representation for Human-Robot Transfer in Few-Shot Imitation Learning

IEEE International Conference on Robotics and Automation (ICRA), 2025

387

13 Jan 2025

Whole-Body Integrated Motion Planning for Aerial Manipulators

374

11 Jan 2025

OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints

Computer Vision and Pattern Recognition (CVPR), 2025

298

08 Jan 2025

Visual Large Language Models for Generalized and Specialized Applications

499

06 Jan 2025

T-DOM: A Taxonomy for Robotic Manipulation of Deformable Objects

305

31 Dec 2024

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI

348

30 Dec 2024

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Computer Vision and Pattern Recognition (CVPR), 2024

576

428

18 Dec 2024

Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection

Computer Vision and Pattern Recognition (CVPR), 2024

705

18 Dec 2024

RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation

...

652

116

18 Dec 2024

Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning

International Conference on Learning Representations (ICLR), 2024

341

17 Dec 2024

AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation

502

09 Dec 2024

A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions

ACM Computing Surveys (ACM CSUR), 2024

488

07 Dec 2024

Quantization-Aware Imitation-Learning for Resource-Efficient Robotic Control

365

02 Dec 2024

Robot Learning with Super-Linear Scaling

348

02 Dec 2024

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

...

388

242

29 Nov 2024

RoboMatrix: A Skill-centric Hierarchical Framework for Scalable Robot Task Planning and Execution in Open-World

...

655

29 Nov 2024

Don't Let Your Robot be Harmful: Responsible Robotic Manipulation via Safety-as-Policy

IEEE Robotics and Automation Letters (RA-L), 2024

443

27 Nov 2024

Inference-Time Policy Steering through Human Interactions

IEEE International Conference on Robotics and Automation (ICRA), 2024

Yanwei Wang

Lirui Wang

Yilun Du

Balakumar Sundaralingam

Xuning Yang

Yu-Wei Chao

Claudia Pérez-D'Arpino

Dieter Fox

Julie Shah

VGen

552

25 Nov 2024

RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics

Computer Vision and Pattern Recognition (CVPR), 2024

993

102

25 Nov 2024

Iris: Integrating Language into Diffusion-based Monocular Depth Estimation

820

24 Nov 2024

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

Computer Vision and Pattern Recognition (CVPR), 2024

473

21 Nov 2024

Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics

682

18 Nov 2024

Few-Shot Task Learning through Inverse Generative Modeling

Neural Information Processing Systems (NeurIPS), 2024

520

07 Nov 2024

STEER: Flexible Robotic Manipulation via Dense Language Grounding

IEEE International Conference on Robotics and Automation (ICRA), 2024

Laura Smith

A. Irpan

Montserrat Gonzalez Arenas

316

05 Nov 2024

Addressing Failures in Robotics using Vision-Based Language Models (VLMs) and Behavior Trees (BT)

Faseeh Ahmad

Jonathan Styrud

Volker Krueger

302

03 Nov 2024

CLIP-RT: Learning Language-Conditioned Robotic Policies from Natural Language Supervision

969

01 Nov 2024

Local Policies Enable Zero-shot Long-horizon Manipulation

IEEE International Conference on Robotics and Automation (ICRA), 2024

503

29 Oct 2024

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Johannes Brandstetter

Günter Klambauer

Razvan Pascanu

Sepp Hochreiter

823

29 Oct 2024

HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots

IEEE International Conference on Robotics and Automation (ICRA), 2024

Zhenjia Xu

...

Xiaolong Wang

379

112

28 Oct 2024

MotionGlot: A Multi-Embodied Motion Generation Model

IEEE International Conference on Robotics and Automation (ICRA), 2024

Sudarshan Harithas

Srinath Sridhar

429

22 Oct 2024

A Dual Process VLA: Efficient Robotic Manipulation Leveraging VLM

ByungOk Han

Jaehong Kim

Jinhyeok Jang

406

21 Oct 2024

VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making

393

21 Oct 2024

Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance

Conference on Robot Learning (CoRL), 2024

387

17 Oct 2024

The State of Robot Motion Generation

411

16 Oct 2024

In-Context Learning Enables Robot Action Prediction in LLMs

IEEE International Conference on Robotics and Automation (ICRA), 2024

605

16 Oct 2024

Latent Action Pretraining from Videos

International Conference on Learning Representations (ICLR), 2024

...

522

191

15 Oct 2024

Zero-Shot Offline Imitation Learning via Optimal Transport

1.2K

11 Oct 2024

Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback

725

11 Oct 2024

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

International Conference on Learning Representations (ICLR), 2024

Zhengyi Wang

Hang Su

Jun Zhu

415

463

10 Oct 2024

Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation

Qingwen Bu

Hongyang Li

Li Chen

430

10 Oct 2024

Mastering Contact-rich Tasks by Combining Soft and Rigid Robotics with Imitation Learning

Mariano Ramírez Montero

335

10 Oct 2024

Zero-Shot Generalization of Vision-Based RL Without Data Augmentation

Sumeet Batra

Gaurav Sukhatme

OffRL DRL

309

09 Oct 2024

LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation

Zhijie Wang

Zhehua Zhou

Jiayang Song

Yuheng Huang

Zhan Shu

Lei Ma

268

07 Oct 2024

Control-oriented Clustering of Visual Latent Representation

International Conference on Learning Representations (ICLR), 2024

583

07 Oct 2024

Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models

304

04 Oct 2024

Autoregressive Action Sequence Learning for Robotic Manipulation

IEEE Robotics and Automation Letters (RA-L), 2024

Yuhan Liu

524

04 Oct 2024

Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy

IEEE International Conference on Robotics and Automation (ICRA), 2024

374

02 Oct 2024