RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

28 July 2023
Anthony Brohan
Noah Brown
Justice Carbajal
Yevgen Chebotar
Xi Chen
K. Choromanski
Tianli Ding
Danny Driess
Kumar Avinava Dubey
Chelsea Finn
Peter R. Florence
Chuyuan Fu
Montse Gonzalez Arenas
K. Gopalakrishnan
Kehang Han
Karol Hausman
Alexander Herzog
Jasmine Hsu
Brian Ichter
A. Irpan
Nikhil J. Joshi
Ryan C. Julian
Dmitry Kalashnikov
Yuheng Kuang
Isabel Leal
Lisa Lee
Tsang-Wei Edward Lee
Sergey Levine
Yao Lu
Henryk Michalewski
Igor Mordatch
Karl Pertsch
Kanishka Rao
Krista Reymann
Michael S. Ryoo
Grecia Salazar
Pannag R. Sanketi
P. Sermanet
Jaspiar Singh
Anika Singh
Radu Soricut
Huong Tran
Vincent Vanhoucke
Q. Vuong
Ayzaan Wahid
Stefan Welker
Paul Wohlhart
Jialin Wu
Fei Xia
Ted Xiao
Peng-Tao Xu
Sichun Xu
Tianhe Yu
Brianna Zitkovich
    LM&Ro
    LRM

Papers citing "RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control"

44 / 194 papers shown
Embracing Large Language and Multimodal Models for Prosthetic Technologies
S. Dey
Arndt F. Schilling
16
1
0
08 Mar 2024
Embodied Understanding of Driving Scenarios
Yunsong Zhou
Linyan Huang
Qingwen Bu
Jia Zeng
Tianyu Li
Hang Qiu
Hongzi Zhu
Minyi Guo
Yu Qiao
Hongyang Li
LM&Ro
55
30
0
07 Mar 2024
DNAct: Diffusion Guided Multi-Task 3D Policy Learning
Ge Yan
Yueh-hua Wu
Xiaolong Wang
VGen
27
20
0
07 Mar 2024
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
Junting Chen
Yao Mu
Qiaojun Yu
Tianming Wei
Silang Wu
...
Wenqi Shao
Yu Qiao
Huazhe Xu
Mingyu Ding
Ping Luo
LM&Ro
25
11
0
22 Feb 2024
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Zheng Xiong
Risto Vuorio
Jacob Beck
Matthieu Zimmer
Kun Shao
Shimon Whiteson
27
1
0
09 Feb 2024
CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps
Shigemichi Matsuzaki
Takuma Sugino
Kazuhito Tanaka
Zijun Sha
Shintaro Nakaoka
Shintaro Yoshizawa
Kazuhiro Shintani
VLM
11
5
0
08 Feb 2024
Zero-Shot Reinforcement Learning via Function Encoders
Tyler Ingebrand
Amy Zhang
Ufuk Topcu
OffRL
22
2
0
30 Jan 2024
CognitiveOS: Large Multimodal Model based System to Endow Any Type of Robot with Generative AI
Artem Lykov
Mikhail Konenkov
Koffivi Fidele Gbagbe
Mikhail Litvinov
D. Davletshin
A. Fedoseev
Miguel Altamirano Cabrera
Robinroy Peter
Dzmitry Tsetserukou
LM&Ro
29
5
0
29 Jan 2024
Imitation Learning Inputting Image Feature to Each Layer of Neural Network
Koki Yamane
S. Sakaino
T. Tsuji
9
3
0
18 Jan 2024
CognitiveDog: Large Multimodal Model Based System to Translate Vision and Language into Action of Quadruped Robot
Artem Lykov
Mikhail Litvinov
Mikhail Konenkov
Rinat Prochii
Nikita Burtsev
Ali Alridha Abdulkarim
Artem Bazhenov
Vladimir Berman
Dzmitry Tsetserukou
VLM
LM&Ro
8
17
0
17 Jan 2024
RePLan: Robotic Replanning with Perception and Language Models
Marta Skreta
Zihan Zhou
Jia Lin Yuan
Kourosh Darvish
Alán Aspuru-Guzik
Animesh Garg
LM&Ro
LRM
27
26
0
08 Jan 2024
General-purpose foundation models for increased autonomy in robot-assisted surgery
Samuel Schmidgall
Ji Woong Kim
Alan Kuntz
A. Ghazi
Axel Krieger
MedIm
36
8
0
01 Jan 2024
LLM-SAP: Large Language Models Situational Awareness Based Planning
Liman Wang
Hanyang Zhong
LLMAG
23
2
0
26 Dec 2023
ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation
Xiaoqi Li
Mingxu Zhang
Yiran Geng
Haoran Geng
Yuxing Long
Yan Shen
Renrui Zhang
Jiaming Liu
Hao Dong
LM&Ro
LRM
25
78
0
24 Dec 2023
LHManip: A Dataset for Long-Horizon Language-Grounded Manipulation Tasks in Cluttered Tabletop Environments
Federico Ceola
Lorenzo Natale
Niko Sünderhauf
Krishan Rana
LM&Ro
22
1
0
19 Dec 2023
Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation
Shaopeng Zhai
Jie Wang
Tianyi Zhang
Fuxian Huang
Qi Zhang
Ming Zhou
Jing Hou
Yu Qiao
Yu Liu
LLMAG
LM&Ro
19
1
0
12 Dec 2023
Photorealistic Video Generation with Diffusion Models
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
VGen
30
172
0
11 Dec 2023
Harmonic Mobile Manipulation
Ruihan Yang
Yejin Kim
Aniruddha Kembhavi
Xiaolong Wang
Kiana Ehsani
23
13
0
11 Dec 2023
Large Scale Foundation Models for Intelligent Manufacturing Applications: A Survey
Haotian Zhang
S. D. Semujju
Zhicheng Wang
Xianwei Lv
Kang Xu
...
Jing Wu
Zhuo Long
Wensheng Liang
Xiaoguang Ma
Ruiyan Zhuang
UQCV
AI4TS
AI4CE
27
4
0
11 Dec 2023
On the Role of the Action Space in Robot Manipulation Learning and Sim-to-Real Transfer
Elie Aljalbout
Felix Frank
Maximilian Karl
Patrick van der Smagt
8
20
0
06 Dec 2023
Human Demonstrations are Generalizable Knowledge for Robots
Te Cui
Guangyan Chen
Tianxing Zhou
Zicai Peng
Mengxiao Hu
Haoyang Lu
Haizhou Li
Meiling Wang
Yi Yang
Yufeng Yue
LM&Ro
27
6
0
05 Dec 2023
Dolphins: Multimodal Language Model for Driving
Yingzi Ma
Yulong Cao
Jiachen Sun
Marco Pavone
Chaowei Xiao
MLLM
21
49
0
01 Dec 2023
On Bringing Robots Home
Nur Muhammad (Mahi) Shafiullah
Anant Rai
Haritheja Etukuru
Yiqian Liu
Ishan Misra
Soumith Chintala
Lerrel Pinto
22
74
0
27 Nov 2023
ADriver-I: A General World Model for Autonomous Driving
Fan Jia
Weixin Mao
Yingfei Liu
Yucheng Zhao
Yuqing Wen
Chi Zhang
Xiangyu Zhang
Tiancai Wang
22
63
0
22 Nov 2023
Advances in Embodied Navigation Using Large Language Models: A Survey
Jinzhou Lin
Han Gao
Xuxiang Feng
Rongtao Xu
Changwei Wang
Man Zhang
Li Guo
Shibiao Xu
LM&Ro
LLMAG
56
9
0
01 Nov 2023
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Haoyi Zhu
Honghui Yang
Xiaoyang Wu
Di Huang
Sha Zhang
...
Hengshuang Zhao
Chunhua Shen
Yu Qiao
Tong He
Wanli Ouyang
SSL
69
42
0
12 Oct 2023
AirExo: Low-Cost Exoskeletons for Learning Whole-Arm Manipulation in the Wild
Hongjie Fang
Haoshu Fang
Yiming Wang
Jieji Ren
Jing Chen
Ruo Zhang
Weiming Wang
Cewu Lu
24
46
0
26 Sep 2023
SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs
Guangyao Zhai
Xiaoni Cai
Dianye Huang
Yan Di
Fabian Manhardt
Federico Tombari
Nassir Navab
Benjamin Busam
LM&Ro
10
26
0
21 Sep 2023
HiCRISP: An LLM-based Hierarchical Closed-Loop Robotic Intelligent Self-Correction Planner
Chenlin Ming
Jiacheng Lin
Pangkit Fong
Han Wang
Xiaoming Duan
Jianping He
15
1
0
21 Sep 2023
Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model
Xinyu Zhou
Delong Chen
Yudong Chen
AuLLM
27
0
0
20 Sep 2023
Cognitive Architectures for Language Agents
T. Sumers
Shunyu Yao
Karthik Narasimhan
Thomas L. Griffiths
LLMAG
LM&Ro
34
150
0
05 Sep 2023
ExpeL: LLM Agents Are Experiential Learners
Andrew Zhao
Daniel Huang
Quentin Xu
Matthieu Lin
Y. Liu
Gao Huang
LLMAG
17
192
0
20 Aug 2023
"Tidy Up the Table": Grounding Common-sense Objective for Tabletop Object Rearrangement
Yiqing Xu
David Hsu
LM&Ro
LMTD
21
0
0
21 Jul 2023
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation
Annie Xie
Lisa Lee
Ted Xiao
Chelsea Finn
21
53
0
07 Jul 2023
Transferring Foundation Models for Generalizable Robotic Manipulation
Jiange Yang
Wenhui Tan
Chuhao Jin
Keling Yao
Bei Liu
Jianlong Fu
Ruihua Song
Gangshan Wu
Limin Wang
LM&Ro
45
6
0
09 Jun 2023
Vision-Language Models as Success Detectors
Yuqing Du
Ksenia Konyushkova
Misha Denil
A. Raju
Jessica Landon
Felix Hill
Nando de Freitas
Serkan Cabi
MLLM
LRM
84
76
0
13 Mar 2023
Open-World Object Manipulation using Pre-trained Vision-Language Models
Austin Stone
Ted Xiao
Yao Lu
K. Gopalakrishnan
Kuang-Huei Lee
...
Sean Kirmani
Brianna Zitkovich
F. Xia
Chelsea Finn
Karol Hausman
LM&Ro
142
144
0
02 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
247
4,186
0
30 Jan 2023
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
112
616
0
22 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
155
449
0
12 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
139
430
0
10 Jul 2022
Formal Mathematics Statement Curriculum Learning
Stanislas Polu
Jesse Michael Han
Kunhao Zheng
Mantas Baksys
Igor Babuschkin
Ilya Sutskever
AIMat
73
115
0
03 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
223
897
0
28 Apr 2021