2405.12213
Cited By
v1
v2 (latest)
Octo: An Open-Source Generalist Robot Policy
20 May 2024
Octo Model Team
Dibya Ghosh
Homer Walke
Karl Pertsch
Kevin Black
Oier Mees
Sudeep Dasari
Joey Hejna
Tobias Kreiman
Charles Xu
Jianlan Luo
You Liang Tan
Lawrence Yunliang Chen
Pannag R Sanketi
Quan Vuong
Ted Xiao
Dorsa Sadigh
Chelsea Finn
Sergey Levine
HuggingFace (30 upvotes)
Papers citing
"Octo: An Open-Source Generalist Robot Policy"
50 / 400 papers shown
Title
$\mathcal{E}_0$: Enhancing Generalization and Fine-Grained Control in VLA Models via Continuized Discrete Diffusion
Zhihao Zhan
Jiaying Zhou
Likui Zhang
Qinhan Lv
Hao Liu
...
Ziliang Chen
Tianshui Chen
Keze Wang
Liang Lin
Guangrun Wang
VGen
VLM
116
0
0
26 Nov 2025
Unifying Perception and Action: A Hybrid-Modality Pipeline with Implicit Visual Chain-of-Thought for Robotic Action Generation
Xiangkai Ma
Lekai Xing
Han Zhang
Wenzhong Li
Sanglu Lu
LM&Ro
VGen
167
0
0
25 Nov 2025
Reinforcing Action Policies by Prophesying
Jiahui Zhang
Ze Huang
Chun Gu
Zipei Ma
Li Zhang
184
0
0
25 Nov 2025
Mixture of Horizons in Action Chunking
Dong Jing
Gang Wang
Jiaqi Liu
Weiliang Tang
Zelong Sun
Yunchao Yao
Zhenyu Wei
Y. Liu
Zhiwu Lu
Mingyu Ding
183
0
0
24 Nov 2025
Discover, Learn, and Reinforce: Scaling Vision-Language-Action Pretraining with Diverse RL-Generated Trajectories
Rushuai Yang
Zhiyuan Feng
Tianxiang Zhang
Kaixin Wang
Chuheng Zhang
Li Zhao
Xiu Su
Yi-Ling Chen
Jiang Bian
OffRL
173
0
0
24 Nov 2025
EchoVLA: Robotic Vision-Language-Action Model with Synergistic Declarative Memory for Mobile Manipulation
Min Lin
Xiwen Liang
Bingqian Lin
Liu Jingzhi
Zijian Jiao
...
Yuhan Ma
Yuecheng Liu
Shen Zhao
Yuzheng Zhuang
Xiaodan Liang
LM&Ro
179
0
0
22 Nov 2025
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model
Y. Fu
Ning Chen
Junkai Zhao
Shaozhe Shan
Guocai Yao
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
148
0
0
21 Nov 2025
RynnVLA-002: A Unified Vision-Language-Action and World Model
Jun Cen
Siteng Huang
Yuqian Yuan
Kehan Li
Hangjie Yuan
...
Xin Li
Hao Luo
Fan Wang
Deli Zhao
H. Chen
VGen
SyDa
265
0
0
21 Nov 2025
SPEAR-1: Scaling Beyond Robot Demonstrations via 3D Understanding
Nikolay Nikolov
Giuliano Albanese
Sombit Dey
Aleksandar Yanev
Luc Van Gool
Jan-Nico Zaech
D. Paudel
LM&Ro
300
0
0
21 Nov 2025
RoboCOIN: An Open-Sourced Bimanual Robotic Data COllection for INtegrated Manipulation
Shihan Wu
Xuecheng Liu
Shaoxuan Xie
Pengwei Wang
Xinghang Li
...
Hao Zhao
Tiejun Huang
Shanghang Zhang
Yonghua Lin
Zhongyuan Wang
148
0
0
21 Nov 2025
VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation
Hanyu Zhou
Chuanhao Ma
Gim Hee Lee
140
0
0
21 Nov 2025
BOP-ASK: Object-Interaction Reasoning for Vision-Language Models
V. Bhat
Sungsu Kim
Valts Blukis
Greg Heinrich
Prashanth Krishnamurthy
Ramesh Karri
Stan Birchfield
Farshad Khorrami
Jonathan Tremblay
VLM
189
1
0
20 Nov 2025
When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models
Yuping Yan
Yuhan Xie
Yinxin Zhang
Lingjuan Lyu
Yaochu Jin
AAML
184
1
0
20 Nov 2025
Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight
Yi Yang
X. Li
Yiyang Chen
Jin Song
Yihan Wang
Zipeng Xiao
Jiadi Su
You Qiaoben
Pengfei Liu
Zhijie Deng
VLM
157
0
0
20 Nov 2025
Enhancing End-to-End Autonomous Driving with Risk Semantic Distillaion from VLM
Jack Qin
Zhitao Wang
Yinan Zheng
Keyu Chen
Yang Zhou
Yuanxin Zhong
Siyuan Cheng
100
0
0
18 Nov 2025
HMC: Learning Heterogeneous Meta-Control for Contact-Rich Loco-Manipulation
Lai Wei
Xuanbin Peng
Ri-Zhao Qiu
Tianshu Huang
Xuxin Cheng
Xiaolong Wang
56
2
0
18 Nov 2025
One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow
Zeyuan Wang
Da Li
Yulin Chen
Ye-ling Shi
Liang Bai
Tianyuan Yu
Yanwei Fu
OffRL
132
0
0
17 Nov 2025
Decoupled Action Head: Confining Task Knowledge to Conditioning Layers
Jian Zhou
Sihao Lin
Shuai Fu
Qi Wu
OffRL
68
0
0
15 Nov 2025
SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation
W. Li
Renshan Zhang
Rui Shao
Zhijian Fang
Kaiwen Zhou
Zhuotao Tian
Liqiang Nie
283
2
0
13 Nov 2025
Learning a Thousand Tasks in a Day
Science Robotics (Sci. Robot.), 2025
Kamil Dreczkowski
Pietro Vitiello
Vitalis Vosylius
Edward Johns
OffRL
280
1
0
13 Nov 2025
How Do VLAs Effectively Inherit from VLMs?
Chuheng Zhang
Rushuai Yang
Xiaoyu Chen
Kaixin Wang
Li Zhao
Yi-Ling Chen
Jiang Bian
LM&Ro
234
0
0
10 Nov 2025
ExpReS-VLA: Specializing Vision-Language-Action Models Through Experience Replay and Retrieval
Shahram Najam Syed
Yatharth Ahuja
Arthur Jakobsson
Jeff Ichnowski
VLM
73
1
0
09 Nov 2025
Let Me Show You: Learning by Retrieving from Egocentric Video for Robotic Manipulation
Yichen Zhu
Feifei Feng
84
1
0
07 Nov 2025
EveryDayVLA: A Vision-Language-Action Model for Affordable Robotic Manipulation
Samarth Chopra
Alex McMoil
Ben Carnovale
Evan Sokolson
Rajkumar Kubendran
Samuel Dickerson
66
0
0
07 Nov 2025
TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models
Hokyun Im
Euijin Jeong
Jianlong Fu
Andrey Kolobov
Youngwoon Lee
60
0
0
07 Nov 2025
Real-to-Sim Robot Policy Evaluation with Gaussian Splatting Simulation of Soft-Body Interactions
Kaifeng Zhang
Shuo Sha
Hanxiao Jiang
M. Loper
Hyunjong Song
Guangyan Cai
Zhuo Xu
Xiaochen Hu
Changxi Zheng
Yunzhu Li
219
1
0
06 Nov 2025
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
Shichao Fan
K. Wu
Zhengping Che
X. Wang
Di Wu
...
M. M. Li
Qingjie Liu
Shanghang Zhang
Min Wan
Yong Dai
196
1
0
04 Nov 2025
Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng
Phillip Lippe
Sara Magliacane
OffRL
OCL
258
0
0
04 Nov 2025
Dexterous Robotic Piano Playing at Scale
Le Chen
Yi Zhao
Jan Schneider
Quankai Gao
Simon Guist
Cheng Qian
Juho Kannala
Bernhard Schölkopf
Joni Pajarinen
Dieter Büchler
132
0
0
04 Nov 2025
Scaling Cross-Embodiment World Models for Dexterous Manipulation
Zihao He
Bo Ai
Tongzhou Mu
Yulin Liu
Weikang Wan
Jiawei Fu
Yilun Du
Henrik I. Christensen
H. Su
166
1
0
03 Nov 2025
PixelVLA: Advancing Pixel-level Understanding in Vision-Language-Action Model
Wenqi Liang
Gan Sun
Yao He
Jiahua Dong
Suyan Dai
Ivan Laptev
Salman Khan
Yang Cong
LM&Ro
3DV
VLM
158
0
0
03 Nov 2025
RobustVLA: Robustness-Aware Reinforcement Post-Training for Vision-Language-Action Models
Hongyin Zhang
Shuo Zhang
Junxi Jin
Qixin Zeng
Runze Li
Donglin Wang
VLM
232
1
0
03 Nov 2025
Towards a Multi-Embodied Grasping Agent
Roman Freiberg
Alexander Qualmann
Ngo Anh Vien
Gerhard Neumann
100
0
0
31 Oct 2025
A Step Toward World Models: A Survey on Robotic Manipulation
Peng-Fei Zhang
Ying Cheng
Xiaofan Sun
S. Wang
Lei Zhu
Heng Tao Shen
LM&Ro
602
2
0
31 Oct 2025
DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
Cheng Yin
Yankai Lin
Wang Xu
Sikyuen Tam
Xiangrui Zeng
Zhiyuan Liu
Zhouping Yin
LRM
137
1
0
31 Oct 2025
EgoMI: Learning Active Vision and Whole-Body Manipulation from Egocentric Human Demonstrations
Justin Yu
Yide Shentu
Di Wu
Pieter Abbeel
Ken Goldberg
Philipp Wu
65
2
0
31 Oct 2025
Learning Generalizable Visuomotor Policy through Dynamics-Alignment
Dohyeok Lee
Jung Min Lee
Munkyung Kim
Seokhun Ju
Jin Woo Koo
Kyungjae Lee
Dohyeong Kim
Taehyun Cho
Jungwoo Lee
84
0
0
31 Oct 2025
BLM$_1$: A Boundless Large Model for Cross-Space, Cross-Task, and Cross-Embodiment Learning
Wentao Tan
Bowen Wang
Heng Zhi
Chenyu Liu
Z. Li
...
Chen Xu
Zhibin Wang
Tianshi Wang
Lei Zhu
Heng Tao Shen
LM&Ro
139
0
0
28 Oct 2025
DynaRend: Learning 3D Dynamics via Masked Future Rendering for Robotic Manipulation
Jingyi Tian
Le Wang
Sanping Zhou
Sen Wang
Jiayi Li
Gang Hua
80
0
0
28 Oct 2025
RoboOmni: Proactive Robot Manipulation in Omni-modal Context
Siyin Wang
Jinlan Fu
Feihong Liu
Xinzhe He
Huangxuan Wu
...
Z. F. Wu
Yugang Jiang
See-Kiong Ng
Tat-Seng Chua
Xipeng Qiu
LM&Ro
222
1
0
27 Oct 2025
RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation
Yash Jangir
Yidi Zhang
Kashu Yamazaki
Chenyu Zhang
Kuan-Hsun Tu
Tsung-Wei Ke
Lei Ke
Yonatan Bisk
Katerina Fragkiadaki
92
2
0
27 Oct 2025
ACG: Action Coherence Guidance for Flow-based VLA models
Minho Park
Kinam Kim
J. Hyung
Hyojin Jang
Hoiyeong Jin
Jooyeol Yun
Hojoon Lee
Jaegul Choo
102
0
0
25 Oct 2025
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
105
4
0
24 Oct 2025
GSWorld: Closed-Loop Photo-Realistic Simulation Suite for Robotic Manipulation
Guangqi Jiang
Haoran Chang
Ri-Zhao Qiu
Yutong Liang
Mazeyu Ji
Jiyue Zhu
Zhao Dong
Xueyan Zou
Xiaolong Wang
3DGS
148
3
0
23 Oct 2025
PointMapPolicy: Structured Point Cloud Processing for Multi-Modal Imitation Learning
Xiaogang Jia
Qian Wang
Anrui Wang
Han A. Wang
B. Gyenes
...
Xi Huang
Maximilian Beck
Moritz Reuss
Rudolf Lioutikov
Gerhard Neumann
3DPC
169
0
0
23 Oct 2025
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
GigaBrain Team
Angen Ye
Boyuan Wang
Chaojun Ni
Guan Huang
...
Yukun Zhou
Z. Dong
Z. J. Wang
Zhichao Liu
Zheng Hua Zhu
LM&Ro
VLM
369
1
0
22 Oct 2025
Using Temperature Sampling to Effectively Train Robot Learning Policies on Imbalanced Datasets
Basavasagar Patil
Sydney Belt
Jayjun Lee
Nima Fazeli
Bernadette Bucher
64
0
0
22 Oct 2025
From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors
Z. Zhang
Hao Li
Yalun Dai
Zhengbang Zhu
Lei Zhou
...
S. Chen
Ziwei Liu
Y. Liu
Xinghang Li
Pan Zhou
73
1
0
20 Oct 2025
Consistent Zero-Shot Imitation with Contrastive Goal Inference
Kathryn Wantlin
Chongyi Zheng
Benjamin Eysenbach
148
0
0
20 Oct 2025
RESample: A Robust Data Augmentation Framework via Exploratory Sampling for Robotic Manipulation
Yuquan Xue
Guanxing Lu
Zhenyu Wu
Chuanrui Zhang
Bofang Jia
Zhengyi Gu
Yansong Tang
Ziwei Wang
146
0
0
20 Oct 2025