Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2312.07843
Cited By
Foundation Models in Robotics: Applications, Challenges, and the Future
13 December 2023
Roya Firoozi
Johnathan Tucker
Stephen Tian
Anirudha Majumdar
Jiankai Sun
Weiyu Liu
Yuke Zhu
Shuran Song
Ashish Kapoor
Karol Hausman
Brian Ichter
Danny Driess
Jiajun Wu
Cewu Lu
Mac Schwager
LM&Ro
AI4CE
LRM
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Foundation Models in Robotics: Applications, Challenges, and the Future"
50 / 56 papers shown
Title
Meta-Optimization and Program Search using Language Models for Task and Motion Planning
Denis Shcherba
Eckart Cobo-Briesewitz
Cornelius V. Braun
Marc Toussaint
LM&Ro
LRM
29
0
0
06 May 2025
Position: Foundation Models Need Digital Twin Representations
Yiqing Shen
Hao Ding
Lalithkumar Seenivasan
Tianmin Shu
Mathias Unberath
AI4CE
31
0
0
01 May 2025
Improvements of Dark Experience Replay and Reservoir Sampling towards Better Balance between Consolidation and Plasticity
Taisuke Kobayashi
CLL
34
0
0
29 Apr 2025
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
Chen Wang
Fei Xia
Wenhao Yu
Tingnan Zhang
Ruohan Zhang
Ce Liu
Li Fei-Fei
Jie Tan
Jacky Liang
31
0
0
17 Apr 2025
Deliberate Planning of 3D Bin Packing on Packing Configuration Trees
Hang Zhao
Juzhan Xu
Kexiong Yu
Ruizhen Hu
Chenyang Zhu
K. Xu
65
1
0
06 Apr 2025
Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
Haoqi Yuan
Yu Bai
Yuhui Fu
Bohan Zhou
Yicheng Feng
Xinrun Xu
Yi Zhan
Börje F. Karlsson
Zongqing Lu
LM&Ro
74
0
0
16 Mar 2025
Scalable Real2Sim: Physics-Aware Asset Generation Via Robotic Pick-and-Place Setups
Nicholas Pfaff
Evelyn Fu
Jeremy Binagia
Phillip Isola
Russ Tedrake
47
2
0
01 Mar 2025
A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied
Thomas Adler
Vihang Patil
M. Beck
Korbinian Poppel
Johannes Brandstetter
G. Klambauer
Razvan Pascanu
Sepp Hochreiter
68
4
0
21 Feb 2025
A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions
Elisa Negrini
Yuxuan Liu
Liu Yang
Stanley Osher
Hayden Schaeffer
AI4CE
79
0
0
09 Feb 2025
Large Language Models for Multi-Robot Systems: A Survey
Peihan Li
Zijian An
Shams Abrar
Lifeng Zhou
LM&Ro
LRM
44
4
0
06 Feb 2025
Technical report on label-informed logit redistribution for better domain generalization in low-shot classification with foundation models
Behraj Khan
T. Syed
52
1
0
29 Jan 2025
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
Kei Katsumata
Motonari Kambara
Daichi Yashima
Ryosuke Korekata
Komei Sugiura
56
0
0
28 Jan 2025
3D Foundation Models Enable Simultaneous Geometry and Pose Estimation of Grasped Objects
Weiming Zhi
Haozhan Tang
Tianyi Zhang
Matthew Johnson-Roberson
32
1
0
14 Jul 2024
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
Delin Qu
Qizhi Chen
Pingrui Zhang
Xianqiang Gao
Bin Zhao
Bin Zhao
Dong Wang
Xuelong Li
AI4CE
34
7
0
23 Jun 2024
Evaluating Uncertainty-based Failure Detection for Closed-Loop LLM Planners
Zhi Zheng
Qian Feng
Hang Li
Alois C. Knoll
Jianxiang Feng
44
6
0
01 Jun 2024
Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Yang Zhang
Shixin Yang
Chenjia Bai
Fei Wu
Xiu Li
Zhen Wang
Xuelong Li
LLMAG
31
25
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
60
38
0
23 May 2024
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
Ian Huang
Guandao Yang
Leonidas J. Guibas
29
3
0
26 Apr 2024
Deep Reinforcement Learning for Bipedal Locomotion: A Brief Survey
Lingfan Bao
Josephine N. Humphreys
Tianhu Peng
Chengxu Zhou
65
5
0
25 Apr 2024
Clio: Real-time Task-Driven Open-Set 3D Scene Graphs
Dominic Maggio
Yun Chang
Nathan Hughes
Matthew Trang
Dan Griffith
Carlyn Dougherty
Eric Cristofalo
Lukas Schmid
Luca Carlone
3DV
33
31
0
21 Apr 2024
Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
Peiyuan Zhi
Zhiyuan Zhang
Muzhi Han
Zeyu Zhang
Zhitian Li
Ziyuan Jiao
Ziyuan Jiao
Siyuan Huang
Siyuan Huang
LRM
LM&Ro
38
28
0
16 Apr 2024
What AIs are not Learning (and Why)
M. Stefik
33
0
0
19 Mar 2024
Perceive With Confidence: Statistical Safety Assurances for Navigation with Learning-Based Perception
Anushri Dixit
Zhiting Mei
Meghan Booker
Mariko Storey-Matsutani
Mariko Storey-Matsutani
Allen Z. Ren
Ola Shorinwa
Anirudha Majumdar
29
5
0
13 Mar 2024
Verifiably Following Complex Robot Instructions with Foundation Models
Benedict Quartey
Eric Rosen
Stefanie Tellex
G. Konidaris
LM&Ro
39
10
0
18 Feb 2024
Video Language Planning
Yilun Du
Mengjiao Yang
Peter R. Florence
Fei Xia
Ayzaan Wahid
...
Pieter Abbeel
Josh Tenenbaum
L. Kaelbling
Andy Zeng
Jonathan Tompson
PINN
LM&Ro
84
83
0
16 Oct 2023
L3MVN: Leveraging Large Language Models for Visual Target Navigation
Bangguo Yu
H. Kasaei
M. Cao
LM&Ro
45
83
0
11 Apr 2023
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
209
1,701
0
07 Apr 2023
Audio Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
VGen
60
32
0
13 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro
OffRL
LRM
AI4CE
90
148
0
07 Mar 2023
Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
Xiaoshuai Zhang
Abhijit Kundu
Thomas Funkhouser
Leonidas J. Guibas
Hao Su
Kyle Genova
70
46
0
06 Mar 2023
Open-World Object Manipulation using Pre-trained Vision-Language Models
Austin Stone
Ted Xiao
Yao Lu
K. Gopalakrishnan
Kuang-Huei Lee
...
Sean Kirmani
Brianna Zitkovich
F. Xia
Chelsea Finn
Karol Hausman
LM&Ro
139
144
0
02 Mar 2023
Visual Language Maps for Robot Navigation
Chen Huang
Oier Mees
Andy Zeng
Wolfram Burgard
LM&Ro
145
337
0
11 Oct 2022
CLIP-Fields: Weakly Supervised Semantic Fields for Robotic Memory
Nur Muhammad (Mahi) Shafiullah
Chris Paxton
Lerrel Pinto
Soumith Chintala
Arthur Szlam
VLM
LM&Ro
CLIP
90
155
0
11 Oct 2022
Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic
Tete Xiao
Stephen James
Pieter Abbeel
Jitendra Malik
Trevor Darrell
SSL
146
238
0
06 Oct 2022
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
208
2,413
0
06 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro
DiffM
93
143
0
05 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
240
1,070
0
05 Oct 2022
Grounding Language with Visual Affordances over Unstructured Data
Oier Mees
Jessica Borja-Diaz
Wolfram Burgard
LM&Ro
121
106
0
04 Oct 2022
NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields
Jiankai Sun
Yan Xu
Mingyu Ding
Hongwei Yi
Chen Wang
Jingdong Wang
Liangjun Zhang
Mac Schwager
40
12
0
24 Sep 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
112
616
0
22 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
143
449
0
12 Sep 2022
Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations
Vadim Tschernezki
Iro Laina
Diane Larlus
Andrea Vedaldi
168
184
0
07 Sep 2022
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli
Liane Lovitt
John Kernion
Amanda Askell
Yuntao Bai
...
Nicholas Joseph
Sam McCandlish
C. Olah
Jared Kaplan
Jack Clark
216
327
0
23 Aug 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
136
430
0
10 Jul 2022
Can Wikipedia Help Offline Reinforcement Learning?
Machel Reid
Yutaro Yamada
S. Gu
3DV
RALM
OffRL
124
95
0
28 Jan 2022
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
382
4,010
0
28 Jan 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
PointCLIP: Point Cloud Understanding by CLIP
Renrui Zhang
Ziyu Guo
Wei Zhang
Kunchang Li
Xupeng Miao
Bin Cui
Yu Qiao
Peng Gao
Hongsheng Li
VLM
3DPC
161
428
0
04 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Sample-Efficient Safety Assurances using Conformal Prediction
Rachel Luo
Shengjia Zhao
Jonathan Kuck
B. Ivanovic
Silvio Savarese
Edward Schmerling
Marco Pavone
45
56
0
28 Sep 2021
1
2
Next