Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.08782
Cited By
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis
14 December 2023
Yafei Hu
Quanting Xie
Vidhi Jain
Jonathan M Francis
Jay Patrikar
Nikhil Varma Keetha
Seungchan Kim
Yaqi Xie
Tianyi Zhang
Shibo Zhao
Yu Quan Chong
Chen Wang
Katia P. Sycara
Matthew Johnson-Roberson
Dhruv Batra
Xiaolong Wang
Sebastian A. Scherer
Z. Kira
Fei Xia
Yonatan Bisk
LM&Ro
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis"
50 / 55 papers shown
Title
Instrumentation for Better Demonstrations: A Case Study
Remko Proesmans
Thomas Lips
Francis Wyffels
24
0
0
25 Apr 2025
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
Chen Wang
Fei Xia
Wenhao Yu
Tingnan Zhang
Ruohan Zhang
Ce Liu
Li Fei-Fei
Jie Tan
Jacky Liang
27
0
0
17 Apr 2025
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
Haoqi Yuan
Yu Bai
Yuhui Fu
Bohan Zhou
Yicheng Feng
Xinrun Xu
Yi Zhan
Börje F. Karlsson
Zongqing Lu
LM&Ro
74
0
0
16 Mar 2025
Generative Artificial Intelligence in Robotic Manipulation: A Survey
Kun Zhang
Peng Yun
Jun Cen
Junhao Cai
DiDi Zhu
...
Qifeng Chen
Jia Pan
Wei K. Zhang
Bo Yang
Hua Chen
55
1
0
05 Mar 2025
KineSoft: Learning Proprioceptive Manipulation Policies with Soft Robot Hands
Uksang Yoo
Jonathan M Francis
Jean Oh
Jeffrey Ichnowski
62
1
0
03 Mar 2025
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Poppel
Johannes Brandstetter
G. Klambauer
Razvan Pascanu
Sepp Hochreiter
61
4
0
21 Feb 2025
Are Transformers Truly Foundational for Robotics?
James A. R. Marshall
Andrew B. Barron
AI4CE
71
0
0
25 Nov 2024
GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation
Yangtao Chen
Zixuan Chen
Junhui Yin
Jing Huo
Pinzhuo Tian
Jieqi Shi
Yang Gao
LM&Ro
40
2
0
30 Sep 2024
Embodied-RAG: General Non-parametric Embodied Memory for Retrieval and Generation
Quanting Xie
So Yeon Min
Tianyi Zhang
Kedi Xu
Aarav Bajaj
Ruslan Salakhutdinov
Matthew Johnson-Roberson
Yonatan Bisk
Matthew Johnson-Roberson
Yonatan Bisk
LM&Ro
48
6
0
26 Sep 2024
Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
Sriram Yenamandra
Arun Ramachandran
Mukul Khanna
Karmesh Yadav
Jay Vakil
...
Z. Kira
Dhruv Batra
Roozbeh Mottaghi
Yonatan Bisk
Chris Paxton
LM&Ro
47
6
0
09 Jul 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
43
75
0
27 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
60
38
0
23 May 2024
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang
Guandao Yang
Leonidas J. Guibas
26
3
0
26 Apr 2024
Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey
Lingfan Bao
Josephine N. Humphreys
Tianhu Peng
Chengxu Zhou
65
5
0
25 Apr 2024
Verifiably Following Complex Robot Instructions with Foundation Models
Benedict Quartey
Eric Rosen
Stefanie Tellex
G. Konidaris
LM&Ro
39
10
0
18 Feb 2024
General-purpose foundation models for increased autonomy in robot-assisted surgery
Samuel Schmidgall
Ji Woong Kim
Alan Kuntz
A. Ghazi
Axel Krieger
MedIm
22
8
0
01 Jan 2024
Sample Efficient Preference Alignment in LLMs via Active Exploration
Viraj Mehta
Vikramjeet Das
Ojash Neopane
Yijia Dai
Ilija Bogunovic
Ilija Bogunovic
W. Neiswanger
Stefano Ermon
Jeff Schneider
Willie Neiswanger
OffRL
25
12
0
01 Dec 2023
Video Language Planning
Yilun Du
Mengjiao Yang
Peter R. Florence
Fei Xia
Ayzaan Wahid
...
Pieter Abbeel
Josh Tenenbaum
L. Kaelbling
Andy Zeng
Jonathan Tompson
PINN
LM&Ro
84
83
0
16 Oct 2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Yevgen Chebotar
Q. Vuong
A. Irpan
Karol Hausman
F. Xia
...
Brianna Zitkovich
Tomas Jackson
Kanishka Rao
Chelsea Finn
Sergey Levine
OffRL
110
81
0
18 Sep 2023
Reasoning about the Unseen for Efficient Outdoor Object Navigation
Quanting Xie
Tianyi Zhang
Kedi Xu
Matthew Johnson-Roberson
Yonatan Bisk
LRM
53
9
0
18 Sep 2023
IndoorSim-to-OutdoorReal: Learning to Navigate Outdoors without any Outdoor Experience
Joanne Truong
April Zitkovich
Sonia Chernova
Dhruv Batra
Tingnan Zhang
Jie Tan
Wenhao Yu
LM&Ro
11
13
0
01 May 2023
What Do Self-Supervised Vision Transformers Learn?
Namuk Park
Wonjae Kim
Byeongho Heo
Taekyung Kim
Sangdoo Yun
SSL
65
76
1
01 May 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Xia Hu
LM&MA
123
593
0
26 Apr 2023
Chat with the Environment: Interactive Multimodal Perception Using Large Language Models
Xufeng Zhao
Mengdi Li
C. Weber
Muhammad Burhan Hafez
S. Wermter
LLMAG
LM&Ro
LRM
93
32
0
14 Mar 2023
Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models
Jiarui Xu
Sifei Liu
Arash Vahdat
Wonmin Byeon
Xiaolong Wang
Shalini De Mello
VLM
198
318
0
08 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
87
148
0
07 Mar 2023
RobotSweater: Scalable, Generalizable, and Customizable Machine-Knitted Tactile Skins for Robots
Zilin Si
T. Yu
Katrene Morozov
James McCann
Wenzhen Yuan
16
14
0
06 Mar 2023
NeU-NBV: Next Best View Planning Using Uncertainty Estimation in Image-Based Neural Rendering
Liren Jin
Xieyuanli Chen
Julius Ruckin
Marija Popović
47
52
0
02 Mar 2023
Open-World Object Manipulation using Pre-trained Vision-Language Models
Austin Stone
Ted Xiao
Yao Lu
K. Gopalakrishnan
Kuang-Huei Lee
...
Sean Kirmani
Brianna Zitkovich
F. Xia
Chelsea Finn
Karol Hausman
LM&Ro
139
144
0
02 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Motion Policy Networks
Adam Fishman
Adithya Murali
Clemens Eppner
Bryan N. Peele
Byron Boots
D. Fox
48
55
0
21 Oct 2022
PI-QT-Opt: Predictive Information Improves Multi-Task Robotic Reinforcement Learning at Scale
Kuang-Huei Lee
Ted Xiao
A. Li
Paul Wohlhart
Ian S. Fischer
Yao Lu
29
10
0
15 Oct 2022
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
142
337
0
11 Oct 2022
CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory
Nur Muhammad (Mahi) Shafiullah
Chris Paxton
Lerrel Pinto
Soumith Chintala
Arthur Szlam
VLM
LM&Ro
CLIP
90
155
0
11 Oct 2022
Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials
Aviral Kumar
Anika Singh
F. Ebert
Mitsuhiko Nakamoto
Yanlai Yang
Chelsea Finn
Sergey Levine
OffRL
OnRL
117
64
0
11 Oct 2022
Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic
Tete Xiao
Stephen James
Pieter Abbeel
Jitendra Malik
Trevor Darrell
SSL
144
238
0
06 Oct 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
112
616
0
22 Sep 2022
Open-vocabulary Queryable Scene Representations for Real World Planning
Boyuan Chen
F. Xia
Brian Ichter
Kanishka Rao
K. Gopalakrishnan
Michael S. Ryoo
Austin Stone
Daniel Kappler
LM&Ro
138
179
0
20 Sep 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
136
430
0
10 Jul 2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
OffRL
AI4TS
102
231
0
20 May 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Safe Autonomous Racing via Approximate Reachability on Ego-vision
Bingqing Chen
Jonathan M Francis
Jean Oh
Eric Nyberg
Sylvia L. Herbert
33
14
0
14 Oct 2021
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
203
627
0
12 Oct 2021
FILM: Following Instructions in Language with Modular Methods
So Yeon Min
Devendra Singh Chaplot
Pradeep Ravikumar
Yonatan Bisk
Ruslan Salakhutdinov
LM&Ro
190
159
0
12 Oct 2021
Safety Assurances for Human-Robot Interaction via Confidence-aware Game-theoretic Human Models
Ran Tian
Liting Sun
Andrea V. Bajcsy
M. Tomizuka
Anca Dragan
40
55
0
29 Sep 2021
Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets
F. Ebert
Yanlai Yang
Karl Schmeckpeper
Bernadette Bucher
G. Georgakis
Kostas Daniilidis
Chelsea Finn
Sergey Levine
152
212
0
27 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
Xiuye Gu
Tsung-Yi Lin
Weicheng Kuo
Yin Cui
VLM
ObjD
220
698
0
28 Apr 2021
1
2
Next