2308.10901
Cited By
Structured World Models from Human Videos
21 August 2023
Russell Mendonca
Shikhar Bahl
Deepak Pathak
LM&Ro
Papers citing
"Structured World Models from Human Videos"
50 / 99 papers shown
ManualVLA: A Unified VLA Model for Chain-of-Thought Manual Generation and Robotic Manipulation
Chenyang Gu
Jiaming Liu
Hao Chen
Runzhong Huang
Qingpo Wuwu
...
Ying Li
Renrui Zhang
Peng Jia
Pheng-Ann Heng
Shanghang Zhang
LM&Ro
157
1
0
01 Dec 2025
SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments
Xinyi Li
Zaishuo Xia
Weyl Lu
Chenjie Hao
Yubei Chen
137
0
0
28 Nov 2025
Reinforcing Action Policies by Prophesying
Jiahui Zhang
Ze Huang
Chun Gu
Zipei Ma
Li Zhang
233
1
0
25 Nov 2025
In-N-On: Scaling Egocentric Manipulation with in-the-wild and on-task Data
Xiongyi Cai
Ri-Zhao Qiu
Geng Chen
Lai Wei
Isabella Liu
Tianshu Huang
Xuxin Cheng
Xiaolong Wang
EgoV
356
1
0
19 Nov 2025
Robot Learning from a Physical World Model
Jiageng Mao
Sicheng He
Hao-Ning Wu
Yang You
Shuyang Sun
...
Huizhong Chen
Leonidas Guibas
Vitor Campagnolo Guizilini
Zhengyu Ma
Yue Wang
VGen
PINN
421
0
0
10 Nov 2025
TwinVLA: Data-Efficient Bimanual Manipulation with Twin Single-Arm Vision-Language-Action Models
Hokyun Im
Euijin Jeong
Jianlong Fu
Andrey Kolobov
Youngwoon Lee
84
0
0
07 Nov 2025
Learning Interactive World Model for Object-Centric Reinforcement Learning
Fan Feng
Phillip Lippe
Sara Magliacane
OffRL
OCL
312
0
0
04 Nov 2025
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
Shichao Fan
K. Wu
Zhengping Che
X. Wang
Di Wu
...
M. M. Li
Qingjie Liu
Shanghang Zhang
Min Wan
Yong Dai
247
1
0
04 Nov 2025
Clone Deterministic 3D Worlds
Zaishuo Xia
Yukuan Lu
Xinyi Li
Yifan Xu
Yubei Chen
155
0
0
30 Oct 2025
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
Qixiu Li
Yu Deng
Yaobo Liang
L. Luo
Lei Zhou
...
Hao Chen
Lily Sun
Dong Chen
J. Yang
B. Guo
129
7
0
24 Oct 2025
MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation
Zhuoyang Liu
Jiaming Liu
Jiadong Xu
Nuowei Han
Chenyang Gu
...
Kai Chin Hsieh
K. Wu
Zhengping Che
Yong Dai
Shanghang Zhang
LM&Ro
124
4
0
30 Sep 2025
IGFuse: Interactive 3D Gaussian Scene Reconstruction via Multi-Scans Fusion
Wenhao Hu
Zesheng Li
Haonan Zhou
Liu Liu
Xuexiang Wen
Zhizhong Su
Xi Li
Gaoang Wang
3DGS
130
0
0
18 Aug 2025
Visuomotor Grasping with World Models for Surgical Robots
Hongbin Lin
Bin Li
K. W. S. Au
153
1
0
15 Aug 2025
Large Model Empowered Embodied AI: A Survey on Decision-Making and Embodied Learning
Wenlong Liang
Rui Zhou
Yang Ma
Bing Zhang
Songlin Li
Yijia Liao
Ping Kuang
LM&Ro
3DV
AI4CE
169
8
0
14 Aug 2025
villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
Xiaoyu Chen
Hangxing Wei
Pushi Zhang
Chuheng Zhang
Kaixin Wang
...
Yucen Wang
Xinquan Xiao
Li Zhao
Jianyu Chen
Jiang Bian
LM&Ro
361
15
0
31 Jul 2025
GR-3 Technical Report
Chilam Cheang
S. Chen
Zhongren Cui
Yingdong Hu
Liqun Huang
...
Hongtao Wu
Xin Xiao
Yuyang Xiao
Jiafeng Xu
Yichu Yang
320
46
0
21 Jul 2025
Latent Policy Steering with Embodiment-Agnostic Pretrained World Models
Yiqi Wang
Mrinal Verghese
Jeff Schneider
251
4
0
17 Jul 2025
A Survey: Learning Embodied Intelligence from Physical Simulators and World Models
Xiaoxiao Long
Qingrui Zhao
Kaiwen Zhang
Zihao Zhang
Dingrui Wang
...
Jia Pan
Qiu Shen
Ruigang Yang
X. Cao
Qionghai Dai
LM&Ro
AI4CE
301
19
0
01 Jul 2025
Goal-VLA: Image-Generative VLMs as Object-Centric World Models Empowering Zero-shot Robot Manipulation
Haonan Chen
Jingxiang Guo
Bangjun Wang
Tianrui Zhang
Xuchuan Huang
Boren Zheng
Yiwen Hou
Chenrui Tie
Jiajun Deng
Lin Shao
VGen
LM&Ro
SyDa
174
2
0
30 Jun 2025
SafeMimic: Towards Safe and Autonomous Human-to-Robot Imitation for Mobile Manipulation
Arpit Bahety
Arnav Balaji
Ben Abbatematteo
Roberto Martín-Martín
145
3
0
18 Jun 2025
WorldPrediction: A Benchmark for High-level World Modeling and Long-horizon Procedural Planning
Delong Chen
Willy Chung
Yejin Bang
Ziwei Ji
Pascale Fung
VGen
LM&Ro
250
6
0
04 Jun 2025
What Do Latent Action Models Actually Learn?
International Conference on Learning Representations (ICLR), 2024
Chuheng Zhang
Tim Pearce
Pushi Zhang
Kaixin Wang
Xiaoyu Chen
Wei Shen
Li Zhao
Jiang Bian
174
7
0
27 May 2025
OSVI-WM: One-Shot Visual Imitation for Unseen Tasks using World-Model-Guided Trajectory Generation
Raktim Gautam Goswami
Prashanth Krishnamurthy
Yann LeCun
Farshad Khorrami
VGen
OffRL
268
5
0
26 May 2025
WorldEval: World Model as Real-World Robot Policies Evaluator
Yaxuan Li
Yichen Zhu
Junjie Wen
Chaomin Shen
Yi Xu
OffRL
VGen
194
0
0
25 May 2025
Imagine Beyond! Distributionally Robust Auto-Encoding for State Space Coverage in Online Reinforcement Learning
Nicolas Castanet
Olivier Sigaud
Sylvain Lamprier
OffRL
443
0
0
23 May 2025
TeleOpBench: A Simulator-Centric Benchmark for Dual-Arm Dexterous Teleoperation
Hangyu Li
Qin Zhao
Haoran Xu
Xinyu Jiang
Qingwei Ben
...
Jia Zeng
Hanqing Wang
Bo Dai
Junting Dong
Jiangmiao Pang
467
4
0
19 May 2025
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Jiahui Zhang
Yusen Luo
Abrar Anwar
Sumedh Anand Sontakke
Joseph J Lim
Jesse Thomason
Erdem Biyik
Jesse Zhang
OffRL
LM&Ro
420
18
0
16 May 2025
UniVLA: Learning to Act Anywhere with Task-centric Latent Actions
Robotics: Science and Systems (RSS), 2025
Qingwen Bu
Yanting Yang
Jisong Cai
Shenyuan Gao
Guanghui Ren
Maoqing Yao
Ping Luo
Hongyang Li
889
102
0
09 May 2025
PIN-WM: Learning Physics-INformed World Models for Non-Prehensile Manipulation
Wenxuan Li
Hang Zhao
Zhiyuan Yu
Yu Du
Qin Zou
Ruizhen Hu
K. Xu
SSL
412
7
0
23 Apr 2025
Novel Diffusion Models for Multimodal 3D Hand Trajectory Prediction
Junyi Ma
Wentao Bao
Jingyi Xu
Guanzhong Sun
Xieyuanli Chen
Hesheng Wang
233
4
0
10 Apr 2025
ZeroMimic: Distilling Robotic Manipulation Skills from Web Videos
IEEE International Conference on Robotics and Automation (ICRA), 2025
Junyao Shi
Zhuolun Zhao
Tianyou Wang
Ian Pedroza
Amy Luo
Jie Wang
Jason Ma
Dinesh Jayaraman
LM&Ro
283
13
0
31 Mar 2025
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model
Computer Vision and Pattern Recognition (CVPR), 2025
Mingju Gao
Yike Pan
Huan-ang Gao
Zongzheng Zhang
Wenyi Li
Hao Dong
Hao Tang
Li Yi
Hao Zhao
VGen
255
6
0
25 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
555
35
0
24 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
Shanghang Zhang
622
95
0
13 Mar 2025
LuciBot: Automated Robot Policy Learning from Generated Videos
Xiaowen Qiu
Yian Wang
Jiting Cai
Zhehuan Chen
Chunru Lin
Tsun-Hsuan Wang
Chuang Gan
LM&Ro
VGen
316
2
0
12 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Han Li
Fu Liu
Xianpeng Lang
Xiaolong Sun
VGen
966
4
0
12 Mar 2025
Cross-Embodiment Robotic Manipulation Synthesis via Guided Demonstrations through CycleVAE and Human Behavior Transformer
Apan Dastider
Hao Fang
Mingjie Lin
166
0
0
11 Mar 2025
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments
Soonwoo Kwon
Jin-Young Kim
Hyojun Go
Kyungjune Baek
272
2
0
11 Mar 2025
Four Principles for Physically Interpretable World Models
Jordan Peper
Zhenjiang Mao
Yuang Geng
Siyuan Pan
Ivan Ruchkin
427
5
0
04 Mar 2025
Exo-ViHa: A Cross-Platform Exoskeleton System with Visual and Haptic Feedback for Efficient Dexterous Skill Learning
Xintao Chao
Shilong Mu
Yushan Liu
Shoujie Li
Chuqiao Lyu
Xiao-Ping Zhang
Wenbo Ding
299
1
0
03 Mar 2025
Magma: A Foundation Model for Multimodal AI Agents
Computer Vision and Pattern Recognition (CVPR), 2025
Jianwei Yang
Reuben Tan
Qianhui Wu
Ruijie Zheng
Baolin Peng
...
Seonghyeon Ye
Joel Jang
Yuquan Deng
Lars Liden
Jianfeng Gao
VLM
AI4TS
355
93
0
18 Feb 2025
Learning from Massive Human Videos for Universal Humanoid Pose Control
Jiageng Mao
Siheng Zhao
Siqi Song
Tianheng Shi
Junjie Ye
Mingtong Zhang
Haoran Geng
Jitendra Malik
Vitor Campagnolo Guizilini
Yue Wang
338
23
0
18 Dec 2024
Reinforcement Learning from Wild Animal Videos
Elliot Chane-Sane
Constant Roux
O. Stasse
Nicolas Mansard
952
1
0
05 Dec 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
354
187
0
29 Nov 2024
Understanding World or Predicting Future? A Comprehensive Survey of World Models
ACM Computing Surveys (ACM CSUR), 2024
Jingtao Ding
Yunke Zhang
Yu Shang
Yuheng Zhang
Zefang Zong
...
Fengli Xu
Yong Li
Chen Gao
VGen
SyDa
517
17
0
21 Nov 2024
Grounding Video Models to Actions through Goal Conditioned Exploration
International Conference on Learning Representations (ICLR), 2024
Yunhao Luo
Yilun Du
LM&Ro
VGen
405
21
0
11 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
Hao Fei
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
489
38
0
08 Nov 2024
DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
G. Zhou
Hengkai Pan
Yann LeCun
Lerrel Pinto
VGen
LM&Ro
OffRL
375
103
0
07 Nov 2024
STEER: Flexible Robotic Manipulation via Dense Language Grounding
IEEE International Conference on Robotics and Automation (ICRA), 2024
Laura Smith
A. Irpan
Montserrat Gonzalez Arenas
Sean Kirmani
Dmitry Kalashnikov
Dhruv Shah
Ted Xiao
LLMSV
295
7
0
05 Nov 2024
Pre-trained Visual Dynamics Representations for Efficient Policy Learning
European Conference on Computer Vision (ECCV), 2024
Hao Luo
Bohan Zhou
Zongqing Lu
267
4
0
05 Nov 2024