arXiv:2307.15818
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
28 July 2023
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
K. Choromanski
Tianli Ding
Danny Driess
Kumar Avinava Dubey
Chelsea Finn
Peter R. Florence
Chuyuan Fu
Montse Gonzalez Arenas
K. Gopalakrishnan
Kehang Han
Karol Hausman
Alexander Herzog
Jasmine Hsu
Brian Ichter
A. Irpan
Nikhil J. Joshi
Ryan C. Julian
Dmitry Kalashnikov
Yuheng Kuang
Isabel Leal
Lisa Lee
Tsang-Wei Edward Lee
Sergey Levine
Yao Lu
Henryk Michalewski
Igor Mordatch
Karl Pertsch
Kanishka Rao
Krista Reymann
Michael S. Ryoo
Grecia Salazar
Pannag R. Sanketi
P. Sermanet
Jaspiar Singh
Anika Singh
Radu Soricut
Huong Tran
Vincent Vanhoucke
Q. Vuong
Ayzaan Wahid
Stefan Welker
Paul Wohlhart
Jialin Wu
Fei Xia
Ted Xiao
Peng-Tao Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
LM&Ro
LRM
Papers citing
"RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control"
50 / 194 papers shown
Title
DiSPo: Diffusion-SSM based Policy Learning for Coarse-to-Fine Action Discretization
Nayoung Oh
Jaehyeong Jang
Moonkyeong Jung
Daehyung Park
80
0
0
23 Sep 2024
KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems
Zixuan Wang
Bo Yu
Junzhe Zhao
Wenhao Sun
Sai Hou
Shuai Liang
Xing Hu
Yinhe Han
Yiming Gan
40
1
0
23 Sep 2024
InteLiPlan: An Interactive Lightweight LLM-Based Planner for Domestic Robot Autonomy
Kim Tien Ly
Kai Lu
Ioannis Havoutis
26
2
0
22 Sep 2024
TinyVLA: Towards Fast, Data-Efficient Vision-Language-Action Models for Robotic Manipulation
Junjie Wen
Y. X. Zhu
Jinming Li
Minjie Zhu
Kun Wu
...
Ran Cheng
Chaomin Shen
Yaxin Peng
Feifei Feng
Jian Tang
LM&Ro
56
41
0
19 Sep 2024
TacDiffusion: Force-domain Diffusion Policy for Precise Tactile Manipulation
Yansong Wu
Zongxie Chen
Fan Wu
L. Chen
Liding Zhang
Zhenshan Bing
Abdalla Swikir
Sami Haddadin
57
7
0
17 Sep 2024
Embodiment-Agnostic Action Planning via Object-Part Scene Flow
Weiliang Tang
Jia-Hui Pan
Wei Zhan
Jianshu Zhou
Huaxiu Yao
Yun-Hui Liu
M. Tomizuka
Mingyu Ding
Chi-Wing Fu
41
0
0
16 Sep 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
P. Krishnamurthy
Ramesh Karri
Farshad Khorrami
42
3
0
16 Sep 2024
LLM-as-BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning
Jicong Ao
Fan Wu
Yansong Wu
Abdalla Swikir
Sami Haddadin
26
5
0
16 Sep 2024
Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models
Yuan-Hong Liao
Rafid Mahmood
Sanja Fidler
David Acuna
ReLM
LRM
26
7
0
15 Sep 2024
Intelligent LiDAR Navigation: Leveraging External Information and Semantic Maps with LLM as Copilot
Fujing Xie
Jiajie Zhang
Sören Schwertfeger
33
1
0
13 Sep 2024
CoVLA: Comprehensive Vision-Language-Action Dataset for Autonomous Driving
Hidehisa Arai
Keita Miwa
Kento Sasaki
Yu Yamaguchi
Kohei Watanabe
Shunsuke Aoki
Issei Yamamoto
35
9
0
19 Aug 2024
Jacta: A Versatile Planner for Learning Dexterous and Whole-body Manipulation
Jan Brüdigam
Ali-Adeeb Abbas
Maks Sorokin
Kuan Fang
Brandon Hung
Maya Guru
Stefan Sosnowski
Jiuguang Wang
Sandra Hirche
Simon Le Cleac'h
31
2
0
02 Aug 2024
QueST: Self-Supervised Skill Abstractions for Learning Continuous Control
Atharva Mete
Haotian Xue
Albert Wilcox
Yongxin Chen
Animesh Garg
SSL
21
15
0
22 Jul 2024
GET-Zero: Graph Embodiment Transformer for Zero-shot Embodiment Generalization
Austin Patel
Shuran Song
LM&Ro
22
2
0
20 Jul 2024
Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts
Yanting Yang
Minghao Chen
Qibo Qiu
Jiahao Wu
Wenxiao Wang
Binbin Lin
Ziyu Guan
Xiaofei He
LM&Ro
32
2
0
20 Jul 2024
R+X: Retrieval and Execution from Everyday Human Videos
Georgios Papagiannis
Norman Di Palo
Pietro Vitiello
Edward Johns
48
15
0
17 Jul 2024
Affordance-Guided Reinforcement Learning via Visual Prompting
Olivia Y. Lee
Annie Xie
Kuan Fang
Karl Pertsch
Chelsea Finn
OffRL
LM&Ro
67
7
0
14 Jul 2024
VLMPC: Vision-Language Model Predictive Control for Robotic Manipulation
Wentao Zhao
Jiaming Chen
Ziyu Meng
Donghui Mao
Ran Song
Wei Zhang
35
8
0
13 Jul 2024
RoboMorph: Evolving Robot Morphology using Large Language Models
Kevin Qiu
Krzysztof Ciebiera
Marek Cygan
Łukasz Kuciński
LM&Ro
45
0
0
11 Jul 2024
RoboCAS: A Benchmark for Robotic Manipulation in Complex Object Arrangement Scenarios
Liming Zheng
Feng Yan
Fanfan Liu
Chengjian Feng
Zhuoliang Kang
Lin Ma
38
2
0
09 Jul 2024
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li
Cristina Mata
J. Park
Kumara Kahatapitiya
Yoo Sung Jang
...
Kanchana Ranasinghe
R. Burgert
Mu Cai
Yong Jae Lee
Michael S. Ryoo
LM&Ro
62
25
0
28 Jun 2024
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng
Yuxin Cui
Haomiao Tang
Zekun Qi
Runpei Dong
Jing Bai
Chunrui Han
Zheng Ge
Xiangyu Zhang
Shu-Tao Xia
EGVM
72
31
0
24 Jun 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
41
3
0
20 Jun 2024
BAKU: An Efficient Transformer for Multi-Task Policy Learning
Siddhant Haldar
Zhuoran Peng
Lerrel Pinto
OffRL
32
25
0
11 Jun 2024
Language Guided Skill Discovery
Seungeun Rho
Laura Smith
Tianyu Li
Sergey Levine
Xue Bin Peng
Sehoon Ha
LM&Ro
40
4
0
07 Jun 2024
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang
Xuweiyi Chen
Nikhil Madaan
Madhavan Iyengar
Shengyi Qian
David Fouhey
Joyce Chai
3DV
68
11
0
07 Jun 2024
Aligning Agents like Large Language Models
Adam Jelley
Yuhan Cao
Dave Bignell
Sam Devlin
Tabish Rashid
LM&Ro
28
1
0
06 Jun 2024
ManiCM: Real-time 3D Diffusion Policy via Consistency Model for Robotic Manipulation
Guanxing Lu
Zifeng Gao
Tianxing Chen
Wen-Dao Dai
Ziwei Wang
Yansong Tang
DiffM
68
14
0
03 Jun 2024
Hierarchical World Models as Visual Whole-Body Humanoid Controllers
Nicklas Hansen
Jyothir S V
Vlad Sobal
Yann LeCun
Xiaolong Wang
Hao Su
VGen
38
10
0
28 May 2024
Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control
Michal Nauman
M. Ostaszewski
Krzysztof Jankowski
Piotr Miłoś
Marek Cygan
OffRL
37
16
0
25 May 2024
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Yang Zhang
Shixin Yang
Chenjia Bai
Fei Wu
Xiu Li
Zhen Wang
Xuelong Li
LLMAG
31
25
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
Learning Manipulation Skills through Robot Chain-of-Thought with Sparse Failure Guidance
Kaifeng Zhang
Zhao-Heng Yin
Weirui Ye
Yang Gao
57
3
0
22 May 2024
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Koffivi Fidele Gbagbe
Miguel Altamirano Cabrera
Ali Alabbas
Oussama Alyunes
Artem Lykov
Dzmitry Tsetserukou
LM&Ro
25
17
0
09 May 2024
Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey
Lingfan Bao
Josephine N. Humphreys
Tianhu Peng
Chengxu Zhou
65
5
0
25 Apr 2024
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni
Rui Ye
Yuxian Wei
Zhen Xiang
Yanfeng Wang
Siheng Chen
AAML
32
9
0
19 Apr 2024
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Kwonjoon Lee
Yilun Du
Chuang Gan
41
12
0
16 Apr 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Siyuan Huang
LRM
LM&Ro
38
28
0
16 Apr 2024
Navigating the Landscape of Large Language Models: A Comprehensive Review and Analysis of Paradigms and Fine-Tuning Strategies
Benjue Weng
LM&MA
30
7
0
13 Apr 2024
Training a Vision Language Model as Smartphone Assistant
Nicolai Dorka
Janusz Marecki
Ammar Anwar
14
3
0
12 Apr 2024
Agile and versatile bipedal robot tracking control through reinforcement learning
Jiayi Li
Linqi Ye
Yi Cheng
Houde Liu
Bin Liang
23
1
0
12 Apr 2024
Self-Explainable Affordance Learning with Embodied Caption
Zhipeng Zhang
Zhimin Wei
Guolei Sun
Peng Wang
Luc Van Gool
40
2
0
08 Apr 2024
Humanoid Robots at work: where are we ?
Fabrice R. Noreils
24
3
0
05 Apr 2024
JUICER: Data-Efficient Imitation Learning for Robotic Assembly
Lars Ankile
Anthony Simeonov
Idan Shenfeld
Pulkit Agrawal
LM&Ro
27
14
0
04 Apr 2024
ZeroCAP: Zero-Shot Multi-Robot Context Aware Pattern Formation via Large Language Models
Vishnunandan L. N. Venkatesh
Byung-Cheol Min
LM&Ro
66
2
0
02 Apr 2024
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
Zeren Chen
Zhelun Shi
Xiaoya Lu
Lehan He
Sucheng Qian
...
Zhen-fei Yin
Jing Shao
Cewu Lu
33
5
0
28 Mar 2024
Cross-domain Multi-modal Few-shot Object Detection via Rich Text
Zeyu Shangguan
Daniel Seita
Mohammad Rostami
ObjD
45
1
0
24 Mar 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
...
Thomas Kollar
Sergey Levine
Chelsea Finn
41
172
0
19 Mar 2024
ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models
Runyu Ma
Jelle Luijkx
Zlatan Ajanović
Jens Kober
LM&Ro
LRM
33
7
0
14 Mar 2024
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
Bingqian Lin
Yunshuang Nie
Ziming Wei
Jiaqi Chen
Shikui Ma
Jianhua Han
Hang Xu
Xiaojun Chang
Xiaodan Liang
LM&Ro
LRM
60
19
0
12 Mar 2024