ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2304.09179
  4. Cited By
Pretrained Language Models as Visual Planners for Human Assistance

Pretrained Language Models as Visual Planners for Human Assistance

17 April 2023
Dhruvesh Patel
H. Eghbalzadeh
Nitin Kamra
Michael L. Iuzzolino
Unnat Jain
Ruta Desai
    LM&Ro
ArXivPDFHTML

Papers citing "Pretrained Language Models as Visual Planners for Human Assistance"

12 / 12 papers shown
Title
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Hongxin Zhang
Zeyuan Wang
Qiushi Lyu
Zheyuan Zhang
Sunli Chen
Tianmin Shu
Yilun Du
Kwonjoon Lee
Yilun Du
Chuang Gan
41
12
0
16 Apr 2024
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
  and Simulation
RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
Junting Chen
Yao Mu
Qiaojun Yu
Tianming Wei
Silang Wu
...
Wenqi Shao
Yu Qiao
Huazhe Xu
Mingyu Ding
Ping Luo
LM&Ro
25
11
0
22 Feb 2024
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
PDPP: Projected Diffusion for Procedure Planning in Instructional Videos
Hanlin Wang
Yilu Wu
Sheng Guo
Limin Wang
VGen
DiffM
63
30
0
26 Mar 2023
Learning State-Aware Visual Representations from Audible Interactions
Learning State-Aware Visual Representations from Audible Interactions
Himangi Mittal
Pedro Morgado
Unnat Jain
Abhinav Gupta
64
22
0
27 Sep 2022
ProgPrompt: Generating Situated Robot Task Plans using Large Language
  Models
ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Ishika Singh
Valts Blukis
Arsalan Mousavian
Ankit Goyal
Danfei Xu
Jonathan Tremblay
D. Fox
Jesse Thomason
Animesh Garg
LM&Ro
LLMAG
112
619
0
22 Sep 2022
Masked Autoencoders Are Scalable Vision Learners
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,412
0
11 Nov 2021
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Ego4D: Around the World in 3,000 Hours of Egocentric Video
Kristen Grauman
Andrew Westbury
Eugene Byrne
Zachary Chavis
Antonino Furnari
...
Mike Zheng Shou
Antonio Torralba
Lorenzo Torresani
Mingfei Yan
Jitendra Malik
EgoV
224
1,017
0
13 Oct 2021
Universal Approximation Under Constraints is Possible with Transformers
Universal Approximation Under Constraints is Possible with Transformers
Anastasis Kratsios
Behnoosh Zamanlooy
Tianlin Liu
Ivan Dokmanić
48
26
0
07 Oct 2021
Procedure Planning in Instructional Videos via Contextual Modeling and
  Model-based Policy Learning
Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning
Jing Bi
Jiebo Luo
Chenliang Xu
61
48
0
05 Oct 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text
  Understanding
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
245
557
0
28 Sep 2021
BEHAVIOR: Benchmark for Everyday Household Activities in Virtual,
  Interactive, and Ecological Environments
BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments
S. Srivastava
Chengshu Li
Michael Lingelbach
Roberto Martín-Martín
Fei Xia
...
C. Karen Liu
Silvio Savarese
H. Gweon
Jiajun Wu
Li Fei-Fei
LM&Ro
138
152
0
06 Aug 2021
GridToPix: Training Embodied Agents with Minimal Supervision
GridToPix: Training Embodied Agents with Minimal Supervision
Unnat Jain
Iou-Jen Liu
Svetlana Lazebnik
Aniruddha Kembhavi
Luca Weihs
A. Schwing
20
23
0
14 Apr 2021
1