Structured World Models from Human Videos

21 August 2023

Papers citing "Structured World Models from Human Videos"

PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation

Neural Information Processing Systems (NeurIPS), 2024

315

14 Oct 2024

EgoOops: A Dataset for Mistake Action Detection from Egocentric Videos referring to Procedural Texts

278

07 Oct 2024

IoT-LLM: a framework for enhancing Large Language Model reasoning from real-world sensor data

Patterns, 2024

417

03 Oct 2024

AVID: Adapting Video Diffusion Models to World Models

295

01 Oct 2024

World Model-based Perception for Visual Legged Locomotion

IEEE International Conference on Robotics and Automation (ICRA), 2024

Hang Lai

Jiahang Cao

Jiafeng Xu

Hongtao Wu

Yunfeng Lin

Tao Kong

Yong Yu

Weinan Zhang

VGen

187

25 Sep 2024

Embodiment-Agnostic Action Planning via Object-Part Scene Flow

IEEE International Conference on Robotics and Automation (ICRA), 2024

Wei Zhan

Yun-Hui Liu

Mingyu Ding

229

16 Sep 2024

Hand-Object Interaction Pretraining from Videos

IEEE International Conference on Robotics and Automation (ICRA), 2024

Himanshu Gaurav Singh

Antonio Loquercio

213

12 Sep 2024

Goal-Reaching Policy Learning from Non-Expert Observations via Effective Subgoal Guidance

Conference on Robot Learning (CoRL), 2024

Yang Yang

Hengtao Shen

OffRL

264

06 Sep 2024

GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy

IEEE Robotics and Automation Letters (RA-L), 2024

Peiyan Li

Hongtao Wu

Yan Huang

Chilam Cheang

Liang Wang

Tao Kong

VGen

220

26 Aug 2024

Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

Conference on Robot Learning (CoRL), 2024

Sergey Levine

382

21 Aug 2024

Flow as the Cross-Domain Manipulation Interface

Shuran Song

310

103

21 Jul 2024

TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach

Jichen Sun

Lin Shao

280

03 Jul 2024

Open-TeleVision: Teleoperation with Immersive Active Visual Feedback

Xiaolong Wang

460

208

01 Jul 2024

OpenVLA: An Open-Source Vision-Language-Action Model

...

Dorsa Sadigh

Percy Liang

Chelsea Finn

LM&Ro VLM

597

1,350

13 Jun 2024

Scaling Manipulation Learning with Visual Kinematic Chain Prediction

Xinyu Zhang

Yuhan Liu

Haonan Chang

Abdeslam Boularias

224

12 Jun 2024

Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning

International Conference on Machine Learning (ICML), 2024

258

10 Jun 2024

Learning Manipulation by Predicting Interaction

Li Chen

...

Heming Cui

Bin Zhao

Xuelong Li

Yu Qiao

Hongyang Li

391

01 Jun 2024

World Models for General Surgical Grasping

232

28 May 2024

Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability

Li Chen

453

209

27 May 2024

iVideoGPT: Interactive VideoGPTs are Scalable World Models

Dong Li

290

24 May 2024

A Survey on Vision-Language-Action Models for Embodied AI

885

166

23 May 2024

One-Shot Imitation Learning with Invariance Matching for Robotic Manipulation

Xinyu Zhang

Abdeslam Boularias

387

21 May 2024

Octo: An Open-Source Generalist Robot Policy

...

Dorsa Sadigh

554

876

20 May 2024

Bidirectional Progressive Transformer for Interaction Intention Anticipation

European Conference on Computer Vision (ECCV), 2024

Yang Cao

322

09 May 2024

Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

501

07 May 2024

ScrewMimic: Bimanual Imitation from Human Videos with Screw Space Projection

Arpit Bahety

Priyanka Mandikal

Ben Abbatematteo

Roberto Martín-Martín

292

06 May 2024

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond

...

362

06 May 2024

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey

...

900

28 Apr 2024

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

...

326

19 Mar 2024

AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors

International Conference on Machine Learning (ICML), 2024

263

15 Mar 2024

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

European Conference on Computer Vision (ECCV), 2024

343

106

13 Mar 2024

Spatiotemporal Predictive Pre-training for Robotic Motor Control

Gangshan Wu

369

08 Mar 2024

Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation

Tao Chen

Abhishek Gupta

Pulkit Agrawal

274

119

06 Mar 2024

World Models for Autonomous Driving: An Initial Survey

Haicheng Liao

Chengzhong Xu

428

05 Mar 2024

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

...

280

28 Feb 2024

Learning by Watching: A Review of Video-based Learning Approaches for Robot Manipulation

IEEE Access, 2024

Chrisantus Eze

Christopher Crick

SSL

466

11 Feb 2024

A Survey on Robotics with Foundation Models: toward Embodied AI

280

04 Feb 2024

Adaptive Mobile Manipulation for Articulated Objects In the Open World

318

25 Jan 2024

General Flow as Foundation Affordance for Scalable Robot Learning

Conference on Robot Learning (CoRL), 2024

330

21 Jan 2024

Visual Robotic Manipulation with Depth-Aware Pretraining

IEEE International Conference on Robotics and Biomimetics (ROBIO), 2024

304

17 Jan 2024

Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation

European Conference on Computer Vision (ECCV), 2024

276

15 Jan 2024

Any-point Trajectory Modeling for Policy Learning

Pieter Abbeel

544

170

28 Dec 2023

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

466

232

20 Dec 2023

EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning

Mingyu Ding

Ying Shan

355

11 Dec 2023

Applications of Large Scale Foundation Models for Autonomous Driving

Yu Huang

Yue Chen

Zhu Li

ELM AI4CE LRM ALM LM&Ro

641

20 Nov 2023

DreamSmooth: Improving Model-based Reinforcement Learning via Reward Smoothing

International Conference on Learning Representations (ICLR), 2023

Vint Lee

Pieter Abbeel

Youngwoon Lee

221

02 Nov 2023

Model-Based Runtime Monitoring with Interactive Imitation Learning

IEEE International Conference on Robotics and Automation (ICRA), 2023

Huihan Liu

Shivin Dass

Roberto Martín-Martín

Yuke Zhu

228

26 Oct 2023

Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning

Neural Information Processing Systems (NeurIPS), 2023

289

29 May 2023

Pretrained Language Models as Visual Planners for Human Assistance

IEEE International Conference on Computer Vision (ICCV), 2023

Ruta Desai

326

17 Apr 2023