2501.09747
FAST: Efficient Action Tokenization for Vision-Language-Action Models
17 January 2025
Karl Pertsch
Kyle Stachowicz
Brian Ichter
Danny Driess
Suraj Nair
Q. Vuong
Oier Mees
Chelsea Finn
Sergey Levine
Papers citing
"FAST: Efficient Action Tokenization for Vision-Language-Action Models"
50 / 52 papers shown
Title
Diffusion Models for Robotic Manipulation: A Survey
Rosa Wolf
Yitian Shi
Sheng Liu
Rania Rayyes
127
2
0
01 Jul 2025
AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
Zewei Zhou
Tianhui Cai
Seth Z. Zhao
Yun Zhang
Zhiyu Huang
Bolei Zhou
Jiaqi Ma
LRM
VLM
25
0
0
16 Jun 2025
CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding
Wenxuan Song
Jiayi Chen
Pengxiang Ding
Yuxin Huang
Han Zhao
Donglin Wang
Haoang Li
28
0
0
16 Jun 2025
mimic-one: a Scalable Model Recipe for General Purpose Robot Dexterity
Elvis Nava
Victoriano Montesinos
Erik Bauer
Benedek Forrai
Jonas Pai
...
Stephan-Daniel Gravert
Philipp Wand
Stephan Polinski
Benjamin Grewe
Robert K. Katzschmann
27
0
0
13 Jun 2025
Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop
Justin Kerr
Kush Hari
Ethan Weber
Chung Min Kim
Brent Yi
Tyler Bonnen
Ken Goldberg
Angjoo Kanazawa
121
0
0
12 Jun 2025
From Intention to Execution: Probing the Generalization Boundaries of Vision-Language-Action Models
Irving Fang
Juexiao Zhang
Shengbang Tong
Chen Feng
LM&Ro
63
1
0
11 Jun 2025
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu
Yuanliang Ju
Shengxiang Sun
Igor Gilitschenski
Haruki Nishimura
Masha Itkina
Florian Shkurti
55
0
0
11 Jun 2025
An Open-Source Software Toolkit & Benchmark Suite for the Evaluation and Adaptation of Multimodal Action Models
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Jaewoo Song
Harshvardhan Sikka
41
0
0
10 Jun 2025
Scaling Laws of Motion Forecasting and Planning -- A Technical Report
Mustafa Baniodeh
Kratarth Goel
Scott Ettinger
Carlos Fuertes
Ari Seff
...
Vinutha Kallem
Sergio Casas
Rami Al-Rfou
Benjamin Sapp
Dragomir Anguelov
33
0
0
09 Jun 2025
Real-Time Execution of Action Chunking Flow Policies
Kevin Black
Manuel Y. Galliker
Sergey Levine
OffRL
28
0
0
09 Jun 2025
Robotic Policy Learning via Human-assisted Action Preference Optimization
Wenke Xia
Yichu Yang
Hongtao Wu
Xiao Ma
Tao Kong
Di Hu
35
0
0
08 Jun 2025
BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning
Hongyi Zhou
Weiran Liao
Xi Huang
Yucheng Tang
Fabian Otto
...
Qian Wang
Ömer Erdinç Yagmurlu
Nils Blank
Moritz Reuss
Rudolf Lioutikov
65
0
0
06 Jun 2025
DemoSpeedup: Accelerating Visuomotor Policies via Entropy-Guided Demonstration Acceleration
Lingxiao Guo
Zhengrong Xue
Zijing Xu
Huazhe Xu
168
0
0
05 Jun 2025
STAR: Learning Diverse Robot Skill Abstractions through Rotation-Augmented Vector Quantization
Hao Li
Qi Lv
Rui Shao
Xiang Deng
Yinchuan Li
Jianye Hao
Liqiang Nie
141
1
0
04 Jun 2025
FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
Yiming Zhong
Yumeng Liu
Chuyang Xiao
Zemin Yang
Youzhuo Wang
Yufei Zhu
Ye-ling Shi
Yujing Sun
X. Zhu
Yuexin Ma
61
0
0
02 Jun 2025
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
Danny Driess
Jost Tobias Springenberg
Brian Ichter
Lili Yu
Adrian Li-Bell
...
Allen Z. Ren
Homer Walke
Quan Vuong
Lucy Xiaoyang Shi
Sergey Levine
119
2
0
29 May 2025
ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge
Zhongyi Zhou
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
LM&Ro
LRM
VLM
98
1
0
28 May 2025
SCIZOR: A Self-Supervised Approach to Data Curation for Large-Scale Imitation Learning
Yu Zhang
Yuqi Xie
Huihan Liu
Rutav Shah
Michael Wan
Linxi Fan
Yuke Zhu
58
0
0
28 May 2025
WorldEval: World Model as Real-World Robot Policies Evaluator
Yaxuan Li
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
OffRL
VGen
31
0
0
25 May 2025
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning
Guanxing Lu
Wenkai Guo
Chubin Zhang
Yuheng Zhou
Haonan Jiang
Zifeng Gao
Yansong Tang
Ziwei Wang
OffRL
118
0
0
24 May 2025
3D Equivariant Visuomotor Policy Learning via Spherical Projection
Boce Hu
Dian Wang
David Klee
Heng Tian
Xupeng Zhu
Haojie Huang
Robert Platt
Robin Walters
101
0
0
22 May 2025
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization
Jiaming Zhou
Ke Ye
Jiayi Liu
Teli Ma
Zifang Wang
Ronghe Qiu
Kun-Yu Lin
Zhilin Zhao
Junwei Liang
130
2
0
21 May 2025
From Grounding to Manipulation: Case Studies of Foundation Model Integration in Embodied Robotic Systems
Xiuchao Sui
Daiying Tian
Qi Sun
Ruirui Chen
Dongkyu Choi
Kenneth Kwok
Soujanya Poria
LM&Ro
113
0
0
21 May 2025
Policy Contrastive Decoding for Robotic Foundation Models
Shihan Wu
Ji Zhang
Xu Luo
Junlin Xie
Jingkuan Song
Heng Tao Shen
Lianli Gao
OffRL
271
0
0
19 May 2025
OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning
Fanqi Lin
Ruiqian Nai
Yingdong Hu
Jiacheng You
Junming Zhao
Yang Gao
LRM
101
0
0
17 May 2025
Zero-Shot Visual Generalization in Robot Manipulation
Sumeet Batra
Gaurav Sukhatme
79
0
0
16 May 2025
Surgical Foundation Model Leveraging Compression and Entropy Maximization for Image-Guided Surgical Assistance
Lianhao Yin
O. Meireles
Guy Rosman
Daniela Rus
35
0
0
16 May 2025
Conditioning Matters: Training Diffusion Policies is Faster Than You Think
Zibin Dong
Yicheng Liu
Yinchuan Li
Hang Zhao
Haifeng Zhang
128
0
0
16 May 2025
ManipBench: Benchmarking Vision-Language Models for Low-Level Robot Manipulation
Enyu Zhao
Vedant Raval
Hejia Zhang
Jiageng Mao
Zeyu Shangguan
Stefanos Nikolaidis
Yun Wang
Daniel Seita
LM&Ro
CoGe
98
0
0
14 May 2025
Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware
Justin Yu
Letian Fu
Huang Huang
Karim El-Refai
Rares Andrei Ambrus
Richard Cheng
Muhammad Zubair Irshad
Ken Goldberg
81
1
0
14 May 2025
Training Strategies for Efficient Embodied Reasoning
William Chen
Suneel Belkhale
Suvir Mirchandani
Oier Mees
Danny Driess
Karl Pertsch
Sergey Levine
OffRL
LRM
99
0
0
13 May 2025
Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments
Pranav Guruprasad
Yangyue Wang
Sudipta Chowdhury
Harshvardhan Sikka
Paul Pu Liang
LM&Ro
VLM
455
1
0
08 May 2025
D-CODA: Diffusion for Coordinated Dual-Arm Data Augmentation
Isabella Liu
Jason Chen
Gaurav Sukhatme
Daniel Seita
131
0
0
08 May 2025
Vision-Language-Action Models: Concepts, Progress, Applications and Challenges
Ranjan Sapkota
Yang Cao
Konstantinos I. Roumeliotis
Manoj Karkee
LM&Ro
407
2
0
07 May 2025
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
Chia-Yu Hung
Qi Sun
Pengfei Hong
Amir Zadeh
Chuan Li
U-Xuan Tan
Navonil Majumder
Soujanya Poria
LM&Ro
120
4
0
28 Apr 2025
STDArm: Transferring Visuomotor Policies From Static Data Training to Dynamic Robot Manipulation
YiFan Duan
Heng Li
Yilong Wu
Wenhao Yu
Xinran Zhang
Yedong Shen
Jianmin Ji
Yanzhe Zhang
139
0
0
26 Apr 2025
π_{0.5}: a Vision-Language-Action Model with Open-World Generalization
Physical Intelligence
Kevin Black
Noah Brown
James Darpinian
Karan Dhabalia
...
Homer Walke
Anna Walling
Haohuan Wang
Lili Yu
Ury Zhilinsky
LM&Ro
VLM
137
51
0
22 Apr 2025
AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning
Alan Dao
Dinh Bach Vu
Bui Quang Huy
188
0
0
24 Mar 2025
Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
Tongxuan Tian
Haoyang Li
Bo Ai
Xiaodi Yuan
Zhiao Huang
H. Su
DiffM
AI4CE
123
3
0
15 Mar 2025
Towards Fast, Memory-based and Data-Efficient Vision-Language Policy
Haoxuan Li
Sixu Yan
Yongqian Li
Xinggang Wang
LM&Ro
128
1
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
196
20
0
13 Mar 2025
Open-World Skill Discovery from Unsegmented Demonstrations
Jingwen Deng
Zihao Wang
Shaofei Cai
Hoang Trung-Dung
Yitao Liang
67
1
0
11 Mar 2025
PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM
Alan Dao
Dinh Bach Vu
Tuan Le Duc Anh
Bui Quang Huy
104
0
0
10 Mar 2025
PointVLA: Injecting the 3D World into Vision-Language-Action Models
Chengmeng Li
Junjie Wen
Yan Peng
Chaomin Shen
Feifei Feng
Yinlin Zhu
3DPC
162
9
0
10 Mar 2025
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction
Huang Huang
Fangchen Liu
Letian Fu
Tingfan Wu
Mustafa Mukadam
Jitendra Malik
Ken Goldberg
Pieter Abbeel
LM&Ro
VLM
184
10
0
05 Mar 2025
SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning
Borong Zhang
Yuhao Zhang
Yalan Qin
Yingshan Lei
Josef Dai
Yuanpei Chen
Yaodong Yang
128
0
0
05 Mar 2025
Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding
Wenxuan Song
Jiayi Chen
Pengxiang Ding
Han Zhao
Wei Zhao
Zhide Zhong
Zongyuan Ge
Jun Ma
Haoang Li
114
7
0
04 Mar 2025
Action Tokenizer Matters in In-Context Imitation Learning
An Vuong
M. Vu
Dong An
Ian Reid
121
1
0
03 Mar 2025
ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration
Minjie Zhu
Yinlin Zhu
Jinming Li
Zhongyi Zhou
Junjie Wen
Xiaoyu Liu
Yaxin Peng
Chaomin Shen
Feifei Feng
LM&Ro
153
6
0
26 Feb 2025
Hi Robot: Open-Ended Instruction Following with Hierarchical Vision-Language-Action Models
Lucy Xiaoyang Shi
Brian Ichter
Michael Equi
Liyiming Ke
Karl Pertsch
...
Adrian Li-Bell
Danny Driess
Lachy Groom
Sergey Levine
Chelsea Finn
LM&Ro
LRM
149
23
0
26 Feb 2025