2412.11198
Cited By
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
15 December 2024
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
David Brüggemann
Isinsu Katircioglu
Lin Zhang
Xiaoran Chen
Suman Saha
Marco Cannici
Elie Aljalbout
Botao Ye
Xi Wang
A. Davtyan
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
Papers citing
"GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control"
4 / 4 papers shown
Title
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
38
0
0
22 Apr 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
62
3
0
24 Mar 2025
Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception
Dingkang Liang
Dingyuan Zhang
Xin Zhou
Sifan Tu
Tianrui Feng
Xiaofan Li
Yumeng Zhang
Mingyang Du
Xiao Tan
Xiang Bai
67
2
0
17 Mar 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
86
1
0
24 Feb 2025