Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2206.11795
Cited By
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
23 June 2022
Bowen Baker
Ilge Akkaya
Peter Zhokhov
Joost Huizinga
Jie Tang
Adrien Ecoffet
Brandon Houghton
Raul Sampedro
Jeff Clune
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos"
50 / 54 papers shown
Title
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations
Anthony Liang
Pavel Czempin
Matthew Hong
Yutai Zhou
Erdem Biyik
Stephen Tu
45
0
0
08 May 2025
ViSA-Flow: Accelerating Robot Skill Learning via Large-Scale Video Semantic Action Flow
Changhe Chen
Quantao Yang
Xiaohao Xu
Nima Fazeli
Olov Andersson
24
0
0
02 May 2025
Learning to Drive from a World Model
Mitchell Goff
Greg Hogan
George Hotz
Armand du Parc Locmaria
Kacper Raczy
Harald Schäfer
Adeeb Shihadeh
Weixing Zhang
Yassine Yousfi
34
0
0
27 Apr 2025
Generative AI in Embodied Systems: System-Level Analysis of Performance, Efficiency and Scalability
Zishen Wan
Jiayi Qian
Yuhang Du
Jason J. Jabbour
Yilun Du
Yang Katie Zhao
A. Raychowdhury
Tushar Krishna
Vijay Janapa Reddi
LM&Ro
86
0
0
26 Apr 2025
Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning
Isadora White
Kolby Nottingham
Ayush Maniar
Max Robinson
Hansen Lillemark
Mehul Maheshwari
Lianhui Qin
Prithviraj Ammanabrolu
LLMAG
LM&Ro
115
0
0
24 Apr 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
54
3
0
24 Mar 2025
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Nvidia
Johan Bjorck
Fernando Castañeda
Nikita Cherniadev
Xingye Da
...
Ao Zhang
Hao Zhang
Yizhou Zhao
Ruijie Zheng
Yuke Zhu
VLM
59
15
0
18 Mar 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
78
1
0
24 Feb 2025
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng
Zhiyuan Huang
Zhenghai Xue
Xinrun Wang
Bo An
Shuicheng Yan
77
14
0
17 Feb 2025
Sample-efficient Unsupervised Policy Cloning from Ensemble Self-supervised Labeled Videos
Xin Liu
Yaran Chen
Haoran Li
SSL
94
0
0
14 Dec 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
Yunhao Luo
Yilun Du
LM&Ro
VGen
77
1
0
11 Nov 2024
SPOT: SE(3) Pose Trajectory Diffusion for Object-Centric Manipulation
Cheng-Chun Hsu
Bowen Wen
Jie Xu
Yashraj S. Narang
Xiaolong Wang
Yuke Zhu
Joydeep Biswas
Stan Birchfield
DiffM
35
8
0
01 Nov 2024
Latent Action Pretraining from Videos
Seonghyeon Ye
Joel Jang
Byeongguk Jeon
Sejune Joo
Jianwei Yang
...
Kimin Lee
J. Gao
Luke Zettlemoyer
Dieter Fox
Minjoon Seo
32
27
0
15 Oct 2024
VideoAgent: Self-Improving Video Generation
Achint Soni
Sreyas Venkataraman
Abhranil Chandra
Sebastian Fischmeister
Percy Liang
Bo Dai
Sherry Yang
LM&Ro
VGen
50
7
0
14 Oct 2024
Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers
Jianxin Bi
Kelvin Lim
Kaiqi Chen
Yifei Huang
Harold Soh
25
0
0
10 Oct 2024
Open-World Reinforcement Learning over Long Short-Term Imagination
Jiajian Li
Q. Wang
Yunbo Wang
Xin Jin
Yang Li
Wenjun Zeng
Xiaokang Yang
OCL
VLM
47
1
0
04 Oct 2024
Game On: Towards Language Models as RL Experimenters
Jingwei Zhang
Thomas Lampe
A. Abdolmaleki
Jost Tobias Springenberg
Martin Riedmiller
LM&Ro
29
0
0
05 Sep 2024
MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale
Anton Andreychuk
Konstantin Yakovlev
Aleksandr I. Panov
A. Skrynnik
AI4CE
58
3
0
29 Aug 2024
Aligning Agents like Large Language Models
Adam Jelley
Yuhan Cao
Dave Bignell
Sam Devlin
Tabish Rashid
LM&Ro
28
1
0
06 Jun 2024
Reward Machines for Deep RL in Noisy and Uncertain Environments
Andrew C. Li
Zizhao Chen
Toryn Q. Klassen
Pashootan Vaezipoor
Rodrigo Toro Icarte
Sheila A. McIlraith
46
6
0
31 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
60
75
0
27 May 2024
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
Zhuoling Li
Xiaogang Xu
Zhenhua Xu
Sernam Lim
Hengshuang Zhao
LM&Ro
40
2
0
27 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
67
41
0
23 May 2024
QUAR-VLA: Vision-Language-Action Model for Quadruped Robots
Pengxiang Ding
Han Zhao
Wenxuan Song
Zhitao Wang
Zhenyu Wei
Shangke Lyu
Ningxi Yang
Donglin Wang
30
19
0
22 Dec 2023
Learning to Act without Actions
Dominik Schmidt
Minqi Jiang
OffRL
15
30
0
17 Dec 2023
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
Yiran Qin
Enshen Zhou
Qichang Liu
Zhen-fei Yin
Lu Sheng
Ruimao Zhang
Yu Qiao
Jing Shao
LM&Ro
20
39
0
12 Dec 2023
BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
Rohin Shah
21
6
0
05 Dec 2023
Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games
Lukas Schäfer
Logan Jones
Anssi Kanervisto
Yuhan Cao
Tabish Rashid
Raluca Georgescu
David Bignell
Siddhartha Sen
Andrea Trevino Gavito
Sam Devlin
85
3
0
04 Dec 2023
MM-VID: Advancing Video Understanding with GPT-4V(ision)
Kevin Qinghong Lin
Faisal Ahmed
Linjie Li
Chung-Ching Lin
E. Azarnasab
...
Lin Liang
Zicheng Liu
Yumao Lu
Ce Liu
Lijuan Wang
MLLM
26
63
0
30 Oct 2023
Quilt-1M: One Million Image-Text Pairs for Histopathology
Wisdom O. Ikezogwo
M. S. Seyfioglu
Fatemeh Ghezloo
Dylan Stefan Chan Geva
Fatwir Sheikh Mohammed
Pavan Kumar Anand
Ranjay Krishna
Linda G. Shapiro
CLIP
VLM
128
109
0
20 Jun 2023
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
Shengran Hu
Jeff Clune
LM&Ro
OffRL
LRM
AI4CE
25
27
0
01 Jun 2023
Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning
Jialong Wu
Haoyu Ma
Chao Deng
Mingsheng Long
OffRL
19
24
0
29 May 2023
Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation
David Brandfonbrener
Ofir Nachum
Joan Bruna
AI4CE
12
20
0
26 May 2023
Adaptive Policy Learning to Additional Tasks
Wenjian Hao
Zehui Lu
Zihao Liang
Tianyu Zhou
Shaoshuai Mou
14
0
0
24 May 2023
Policy Learning based on Deep Koopman Representation
Wenjian Hao
Paulo Heredia
Bowen Huang
Zehui Lu
Zihao Liang
Shaoshuai Mou
21
1
0
24 May 2023
Reinforcement Learning from Passive Data via Latent Intentions
Dibya Ghosh
Chethan Bhateja
Sergey Levine
OffRL
11
41
0
10 Apr 2023
Accelerating exploration and representation learning with offline pre-training
Bogdan Mazoure
Jake Bruce
Doina Precup
Rob Fergus
Ankit Anand
OffRL
22
5
0
31 Mar 2023
Language Models can Solve Computer Tasks
Geunwoo Kim
Pierre Baldi
Stephen Marcus McAleer
LLMAG
LM&Ro
35
337
0
30 Mar 2023
Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition
Stephanie Milani
Anssi Kanervisto
Karolis Ramanauskas
Sander Schulhoff
Brandon Houghton
...
Vinicius G. Goecks
Nicholas R. Waytowich
David Watkins
J. Miller
Rohin Shah
22
16
0
23 Mar 2023
Investigating the role of model-based learning in exploration and transfer
Jacob Walker
Eszter Vértes
Yazhe Li
Gabriel Dulac-Arnold
Ankesh Anand
T. Weber
Jessica B. Hamrick
OffRL
26
6
0
08 Feb 2023
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
Zihao Wang
Shaofei Cai
Guanzhou Chen
Anji Liu
Xiaojian Ma
Yitao Liang
LM&Ro
LLMAG
55
315
0
03 Feb 2023
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Kolby Nottingham
Prithviraj Ammanabrolu
Alane Suhr
Yejin Choi
Hannaneh Hajishirzi
Sameer Singh
Roy Fox
LLMAG
LM&Ro
17
76
0
28 Jan 2023
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Doina Precup
Igor Mordatch
Ofir Nachum
OffRL
15
5
0
23 Nov 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
15
334
0
06 Oct 2022
Learning to Generalize with Object-centric Agents in the Open World Survival Game Crafter
Aleksandar Stanić
Yujin Tang
David R Ha
Jürgen Schmidhuber
ELM
16
11
0
05 Aug 2022
Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models
Alex Lamb
Riashat Islam
Yonathan Efroni
Aniket Didolkar
Dipendra Kumar Misra
Dylan J. Foster
Lekan Molu
Rajan Chari
A. Krishnamurthy
John Langford
31
24
0
17 Jul 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
42
347
0
17 Jun 2022
Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS
Eleni Nisioti
Matéo Mahaut
Pierre-Yves Oudeyer
Ida Momennejad
Clément Moulin-Frier
13
9
0
10 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
P. S. Castro
Aaron C. Courville
Marc G. Bellemare
OffRL
OnRL
21
63
0
03 Jun 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
46
783
0
12 May 2022
1
2
Next