OpenVLA: An Open-Source Vision-Language-Action Model

13 June 2024

Quan Vuong

Dorsa Sadigh

Percy Liang

Chelsea Finn

LM&Ro

VLM

Papers citing "OpenVLA: An Open-Source Vision-Language-Action Model"

50 / 727 papers shown

PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation

Robotics (RAS), 2025

312

27 May 2025

Benign-to-Toxic Jailbreaking: Inducing Harmful Responses from Harmless Prompts

206

26 May 2025

RFTF: Reinforcement Fine-tuning for Embodied Agents with Temporal Feedback

Junyang Shu

Zhiwei Lin

Yongtao Wang

337

26 May 2025

Chain-of-Thought for Autonomous Driving: A Comprehensive Survey and Future Prospects

450

26 May 2025

What Can RL Bring to VLA Generalization? An Empirical Study

1.0K

26 May 2025

RetroMotion: Retrocausal Motion Forecasting Models are Instructable

Abhishek Vivekanandan

Carlos Fernandez

Christoph Stiller

330

26 May 2025

HAND Me the Data: Fast Robot Adaptation via Hand Path Retrieval

Matthew Hong

Anthony Liang

Kevin Kim

Harshitha Rajaprakash

Jesse Thomason

Erdem Bıyık

Jesse Zhang

572

26 May 2025

Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review

341

26 May 2025

ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving

...

308

26 May 2025

ReFineVLA: Reasoning-Aware Teacher-Guided Transfer Fine-Tuning

220

25 May 2025

WorldEval: World Model as Real-World Robot Policies Evaluator

237

25 May 2025

Genie Centurion: Accelerating Scalable Real-World Robot Training with Human Rewind-and-Refine Guidance

...

319

24 May 2025

VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning

565

24 May 2025

Bootstrapping Imitation Learning for Long-horizon Manipulation via Hierarchical Data Collection Space

...

300

23 May 2025

One Demo Is All It Takes: Planning Domain Derivation with LLMs from A Single Demonstration

543

23 May 2025

HoloLLM: Multisensory Foundation Model for Language-Grounded Human Sensing and Reasoning

Chuhao Zhou

Jianfei Yang

VLM

550

23 May 2025

SEM: Enhancing Spatial Understanding for Robust Robot Manipulation

353

22 May 2025

ScanBot: Towards Intelligent Surface Scanning in Embodied Robotic Systems

291

22 May 2025

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models

387

22 May 2025

VL-SAFE: Vision-Language Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving

279

22 May 2025

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization

477

21 May 2025

AnyBody: A Benchmark Suite for Cross-Embodiment Manipulation

486

21 May 2025

Robo-DM: Data Management For Large Robot Datasets

IEEE International Conference on Robotics and Automation (ICRA), 2025

Lawrence Yunliang Chen

...

231

21 May 2025

Saliency-Aware Quantized Imitation Learning for Efficient Robotic Control

489

21 May 2025

Object-Focus Actor for Data-efficient Robot Generalization Dexterous Manipulation

348

21 May 2025

APEX: Empowering LLMs with Physics-Based Task Planning for Real-time Insight

401

20 May 2025

GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation

553

19 May 2025

Policy Contrastive Decoding for Robotic Foundation Models

936

19 May 2025

RoboFAC: A Comprehensive Framework for Robotic Failure Analysis and Correction

373

18 May 2025

OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning

272

17 May 2025

Unveiling the Potential of Vision-Language-Action Models with Open-Ended Multimodal Instructions

289

16 May 2025

ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations

Jiahui Zhang

Yusen Luo

Abrar Anwar

Sumedh Anand Sontakke

475

16 May 2025

Search-TTA: A Multimodal Test-Time Adaptation Framework for Visual Search in the Wild

...

765

16 May 2025

Real-Time Out-of-Distribution Failure Prevention via Multi-Modal Reasoning

531

15 May 2025

RT-Cache: Training-Free Retrieval for Real-Time Manipulation

515

14 May 2025

ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation

395

14 May 2025

Mini Diffuser: Fast Multi-task Diffusion Policy Training Using Two-level Mini-batches

329

14 May 2025

VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation

408

14 May 2025

TransDiffuser: Diverse Trajectory Generation with Decorrelated Multi-modal Representation for End-to-end Autonomous Driving

490

14 May 2025

Augmented Reality for RObots (ARRO): Pointing Visuomotor Policies Towards Visual Robustness

590

13 May 2025

Training Strategies for Efficient Embodied Reasoning

528

13 May 2025

LaDi-WM: A Latent Diffusion-based World Model for Predictive Manipulation

620

13 May 2025

From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

521

13 May 2025

DexWild: Dexterous Human Interactions for In-the-Wild Robot Policies

Robotics (RAS), 2025

244

12 May 2025

Pixel Motion as Universal Representation for Robot Control

476

12 May 2025

X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real

774

11 May 2025

Efficient Robotic Policy Learning via Latent Space Backward Planning

369

11 May 2025

UniVLA: Learning to Act Anywhere with Task-centric Latent Actions

Robotics (RAS), 2025

958

190

09 May 2025

StreamBridge: Turning Your Offline Video Large Language Model into a Proactive Streaming Assistant

697

08 May 2025

SITE: towards Spatial Intelligence Thorough Evaluation

377

08 May 2025