2011.02523
Cited By
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
4 November 2020
Mike Roberts
Jason Ramapuram
Anurag Ranjan
Atulit Kumar
Miguel Angel Bautista
Nathan Paczan
Russ Webb
Joshua M. Susskind
Papers citing
"Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding"
50 / 358 papers shown
COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence
Zefeng Zhang
Xiangzhao Hao
Hengzhu Tang
Zhenyu Zhang
Jiawei Sheng
...
Zhenyang Li
Li Gao
Daiting Shi
D. Yin
Tingwen Liu
LRM
VLM
235
3
0
04 Dec 2025
LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging
Zhijian Shu
Cheng Lin
Tao Xie
Wei Yin
B. Li
...
W. Li
Yao Yao
Xun Cao
X. Guo
Xiao-Xiao Long
ViT
273
1
0
04 Dec 2025
UniLight: A Unified Representation for Lighting
Zitian Zhang
Iliyan Georgiev
Michael Fischer
Yannick Hold-Geoffroy
Jean-François Lalonde
Valentin Deschaintre
132
0
0
03 Dec 2025
ReasonX: MLLM-Guided Intrinsic Image Decomposition
Alara Dirik
Tuanfeng Y. Wang
Duygu Ceylan
Stefanos Zafeiriou
Anna Frühstück
103
2
0
03 Dec 2025
MVRoom: Controllable 3D Indoor Scene Generation with Multi-View Diffusion Models
Shaoheng Fang
Chaohui Yu
Fan Wang
Qixing Huang
DiffM
190
0
0
03 Dec 2025
LumiX: Structured and Coherent Text-to-Intrinsic Generation
Xu Han
Biao Zhang
Xiangjun Tang
Xianzhi Li
Peter Wonka
VGen
230
0
0
02 Dec 2025
FOD-S2R: A FOD Dataset for Sim2Real Transfer Learning based Object Detection
Ashish Vashist
Qiranul Saadiyean
Suresh Sundaram
Chandra Sekhar Seelamantula
102
0
0
01 Dec 2025
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
Jing He
Haodong Li
Mingzhi Sheng
Ying-Cong Chen
DiffM
3DV
246
3
0
30 Nov 2025
MARVO: Marine-Adaptive Radiance-aware Visual Odometry
Sacchin Sundar
Atman Kikani
Aaliya Alam
Sumukh Shrote
A. Nayeemulla Khan
A. Shahina
MDE
425
0
0
28 Nov 2025
Fin3R: Fine-tuning Feed-forward 3D Reconstruction Models via Monocular Knowledge Distillation
Weining Ren
Hongjun Wang
Xiao Tan
Kai Han
143
0
0
27 Nov 2025
CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
Dianbing Xi
Jiepeng Wang
Yuanzhi Liang
Xi Qiu
Jialun Liu
...
Yuchi Huo
Rui Wang
H. Huang
Chi Zhang
Xuelong Li
DiffM
VGen
276
2
0
26 Nov 2025
G²VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning
Wenbo Hu
Jingli Lin
Yilin Long
Yunlong Ran
Lihan Jiang
Y. Wang
Chenming Zhu
Runsen Xu
Tai Wang
Jiangmiao Pang
VLM
354
5
0
26 Nov 2025
Qwen3-VL Technical Report
Shuai Bai
Yuxuan Cai
Ruizhe Chen
Keqin Chen
Xionghui Chen
...
Jingren Zhou
F. I. S. Kevin Zhou
J. Zhou
Yuanzhi Zhu
Ke Zhu
VLM
2.2K
446
0
26 Nov 2025
AmodalGen3D: Generative Amodal 3D Object Reconstruction from Sparse Unposed Views
Junwei Zhou
Yu-Wing Tai
99
1
0
26 Nov 2025
LocateAnything3D: Vision-Language 3D Detection with Chain-of-Sight
Yunze Man
S. S. Wang
Guowen Zhang
Johan Bjorck
Zhiqi Li
Liang-Yan Gui
Jim Fan
Jan Kautz
Yu Wang
Zhiding Yu
173
1
0
25 Nov 2025
AMB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend
Hengyi Wang
Lourdes Agapito
176
0
0
25 Nov 2025
DetAny4D: Detect Anything 4D Temporally in a Streaming RGB Video
Jiawei Hou
Shenghao Zhang
Can Wang
Zheng Gu
Yonggen Ling
Taiping Zeng
Xiangyang Xue
Jingbo Zhang
3DPC
187
0
0
24 Nov 2025
Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers
Yiqing Shi
Yiren Song
Mike Zheng Shou
DiffM
MDE
371
0
0
24 Nov 2025
4D-VGGT: A General Foundation Model with SpatioTemporal Awareness for Dynamic Scene Geometry Estimation
Haonan Wang
Hanyu Zhou
Haoyue Liu
Luxin Yan
144
2
0
23 Nov 2025
Muskie: Multi-view Masked Image Modeling for 3D Vision Pre-training
Wenyu Li
Sidun Liu
Peng Qiao
Y. Dou
Tongrui Hu
218
0
0
22 Nov 2025
MuM: Multi-View Masked Image Modeling for 3D Vision
David Nordström
Johan Edstedt
Fredrik Kahl
Georg Bökman
321
0
0
21 Nov 2025
Multi-Order Matching Network for Alignment-Free Depth Super-Resolution
Zhengxue Wang
Zhiqiang Yan
Y. Wu
Guangwei Gao
Xiang Li
Jian Yang
3DV
SupR
324
2
0
20 Nov 2025
RoMa v2: Harder Better Faster Denser Feature Matching
Johan Edstedt
David Nordström
Yushan Zhang
Georg Bökman
Jonathan Astermark
Viktor Larsson
Anders Heyden
Fredrik Kahl
Mårten Wadenbäck
Michael Felsberg
3DV
3DH
585
7
0
19 Nov 2025
Lightweight Optimal-Transport Harmonization on Edge Devices
Maria Larchenko
Dmitry Guskov
Alexander Lobashev
Georgy Derevyanko
120
0
0
16 Nov 2025
Depth Anything 3: Recovering the Visual Space from Any Views
Haotong Lin
Sili Chen
Junhao Liew
Donny Y. Chen
Z. Li
Guang Shi
Jiashi Feng
Bingyi Kang
3DV
VLM
MDE
988
137
0
13 Nov 2025
Visual Spatial Tuning
Rui Yang
Ziyu Zhu
Yanwei Li
Jingjia Huang
Shen Yan
...
Xiangtai Li
S. Li
Wenqian Wang
Yi Lin
Hengshuang Zhao
VLM
403
21
0
07 Nov 2025
Room Envelopes: A Synthetic Dataset for Indoor Layout Reconstruction from Images
Sam Bahrami
Dylan Campbell
3DV
269
0
0
06 Nov 2025
Cambrian-S: Towards Spatial Supersensing in Video
Shusheng Yang
J. Yang
Pinzhi Huang
Ellis L Brown
Zihao Yang
...
Daohan Lu
Rob Fergus
Yann LeCun
Li Fei-Fei
Saining Xie
213
43
0
06 Nov 2025
Generative Semantic Coding for Ultra-Low Bitrate Visual Communication and Analysis
Weiming Chen
Yijia Wang
Zhihan Zhu
Z. He
DiffM
172
0
0
31 Oct 2025
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
Yukun Huang
Jiwen Yu
Yanning Zhou
Jianan Wang
Xintao Wang
Pengfei Wan
Xihui Liu
VGen
197
2
0
30 Oct 2025
Rethinking Visual Intelligence: Insights from Video Pretraining
Pablo Acuaviva
A. Davtyan
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Alexandre Alahi
Paolo Favaro
VLM
LRM
244
2
0
28 Oct 2025
More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models
Hongkai Lin
Dingkang Liang
Mingyang Du
Xin Zhou
X. Bai
MoMe
MDE
VLM
581
1
0
27 Oct 2025
Symmetria: A Synthetic Dataset for Learning in Point Clouds
I. Sipiran
Gustavo Santelices
Lucas Oyarzún
A. Ranieri
C. Romanengo
S. Biasotti
B. Falcidieno
137
0
0
27 Oct 2025
M2H: Multi-Task Learning with Efficient Window-Based Cross-Task Attention for Monocular Spatial Perception
U.V.B.L Udugama
G. Vosselman
F. Nex
172
1
0
20 Oct 2025
DepthVLA: Enhancing Vision-Language-Action Models with Depth-Aware Spatial Reasoning
Tianyuan Yuan
Yicheng Liu
Chenhao Lu
Zhuoguang Chen
Tao Jiang
Hang Zhao
VLM
180
13
0
15 Oct 2025
WorldMirror: Universal 3D World Reconstruction with Any-Prior Prompting
Yifan Liu
Zhiyuan Min
Zhenwei Wang
Junta Wu
Tengfei Wang
Yixuan Yuan
Yawei Luo
Chunchao Guo
3DGS
221
24
0
12 Oct 2025
Synthetic Object Compositions for Scalable and Accurate Learning in Detection, Segmentation, and Grounding
Weikai Huang
Jieyu Zhang
Taoyang Jia
Chenhao Zheng
Ziqi Gao
J. S. Park
Winson Han
Ranjay Krishna
284
0
0
10 Oct 2025
OBCache: Optimal Brain KV Cache Pruning for Efficient Long-Context LLM Inference
Yuzhe Gu
Xiyu Liang
Jiaojiao Zhao
Enmao Diao
187
2
0
09 Oct 2025
Teamwork: Collaborative Diffusion with Low-rank Coordination and Adaptation
Sam Sartor
Pieter Peers
DiffM
203
2
0
07 Oct 2025
Benchmark on Monocular Metric Depth Estimation in Wildlife Setting
Niccolò Niccoli
Lorenzo Seidenari
Ilaria Greco
Francesco Rovero
VLM
MDE
249
0
0
06 Oct 2025
Improved probabilistic regression using diffusion models
Carlo Kneissl
Christopher Bülte
Philipp Scholl
Gitta Kutyniok
DiffM
216
0
0
06 Oct 2025
DEPTHOR++: Robust Depth Enhancement from a Real-World Lightweight dToF and RGB Guidance
Jijun Xiang
Longliang Liu
Xuan Zhu
Xianqi Wang
Min Lin
Xin-She Yang
210
1
0
30 Sep 2025
DA²: Depth Anything in Any Direction
Haodong Li
Wangguangdong Zheng
Jing He
Yuhao Liu
Xin Lin
Xin Yang
Ying-Cong Chen
Chunchao Guo
MDE
637
8
0
30 Sep 2025
BRIDGE -- Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation
Dingning Liu
Haoyu Guo
Jingyi Zhou
Tong He
OffRL
MDE
364
0
0
29 Sep 2025
EfficientDepth: A Fast and Detail-Preserving Monocular Depth Estimation Model
Andrii Litvynchuk
Ivan Livinsky
Anand Ravi
N. Kalantari
Andrii Tsarov
MDE
224
0
0
26 Sep 2025
ControlEvents: Controllable Synthesis of Event Camera Data with Foundational Prior from Image Diffusion Models
Yixuan Hu
Yuxuan Xue
Simon Klenk
Daniel Cremers
Gerard Pons-Moll
DiffM
206
1
0
26 Sep 2025
SLAM-Former: Putting SLAM into One Transformer
Yijun Yuan
Zhuoguang Chen
Kenan Li
Weibang Wang
Hang Zhao
155
0
0
21 Sep 2025
StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
Zhengri Wu
Yiran Wang
Yu Wen
Zeyu Zhang
Biao Wu
Hao Tang
MDE
253
6
0
19 Sep 2025
SPATIALGEN: Layout-guided 3D Indoor Scene Generation
Chuan Fang
Heng Li
Yixun Liang
Jia Zheng
Yongsen Mao
Yuan Liu
Rui Tang
Zihan Zhou
Ping Tan
3DV
453
2
0
18 Sep 2025
Efficient 3D Perception on Embedded Systems via Interpolation-Free Tri-Plane Lifting and Volume Fusion
Sibaek Lee
Jiung Yeon
Hyeonwoo Yu
153
0
0
18 Sep 2025