2404.15506
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
22 March 2024
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Hao Chen
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
Papers citing "Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation"
50 / 88 papers shown
Title
VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery
Bojin Wu
Jing Chen
MDE
31
0
0
05 May 2025
Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction
Simon Giebenhain
Tobias Kirschstein
Martin Rünz
Lourdes Agapito
Matthias Nießner
CVBM
3DH
47
0
0
01 May 2025
The Fourth Monocular Depth Estimation Challenge
Anton Obukhov
Matteo Poggi
Fabio Tosi
Ripudaman Singh Arora
Jaime Spencer
...
Tuan-Anh Yang
Minh-Quang Nguyen
T. Tran
Albert Luginov
Muhammad Shahzad
MDE
24
0
0
24 Apr 2025
A Guide to Structureless Visual Localization
Vojtech Panek
Qunjie Zhou
Yaqing Ding
Sérgio Agostinho
Zuzana Kúkelová
Torsten Sattler
Laura Leal-Taixe
28
0
0
24 Apr 2025
Physically Consistent Humanoid Loco-Manipulation using Latent Diffusion Models
Ilyass Taouil
Haizhou Zhao
Angela Dai
Majid Khadiv
DiffM
36
0
0
23 Apr 2025
MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation
Xingxing Zuo
Nikhil Ranganathan
Connor T. Lee
Georgia Gkioxari
Soon-Jo Chung
VLM
39
1
0
21 Apr 2025
Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
Tao Wen
J. Wang
Y. Chen
Shugong Xu
Chi Zhang
Xuelong Li
MDE
26
0
0
16 Apr 2025
RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements
Guangcong Zheng
Teng Li
Xianpan Zhou
Xi Li
VGen
3DV
51
1
0
11 Apr 2025
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou
Wenqi Xian
Guandao Yang
Mohamed Abdelfattah
Bharath Hariharan
Noah Snavely
Ning Yu
P. Debevec
MDE
22
0
0
09 Apr 2025
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
Songyan Zhang
Yongtao Ge
Jinyuan Tian
Guangkai Xu
Hao Chen
Chen Lv
Chunhua Shen
3DPC
21
0
0
08 Apr 2025
NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
Kexin Tian
Jingrui Mao
Y. Zhang
Jiwan Jiang
Yang Zhou
Zhengzhong Tu
CoGe
60
0
0
04 Apr 2025
WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments
Jianhao Zheng
Zihan Zhu
Valentin Bieri
Marc Pollefeys
Songyou Peng
Iro Armeni
3DGS
19
0
0
04 Apr 2025
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer
Samuel Rota Buló
Yung-Hsu Yang
Nikhil Varma Keetha
Lorenzo Porzi
Norman Muller
Katja Schwarz
Jonathon Luiten
Marc Pollefeys
Peter Kontschieder
3DGS
39
0
0
02 Apr 2025
DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image
Jijun Xiang
Xuan Zhu
Xianqi Wang
Y. Wang
H. Zhang
Fei Guo
Xin-She Yang
31
0
0
02 Apr 2025
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Tian-Xing Xu
Xiangjun Gao
Wenbo Hu
Xiaoyu Li
Song-Hai Zhang
Ying Shan
VGen
MDE
56
1
0
01 Apr 2025
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos
Felix Wimbauer
Weirong Chen
Dominik Muhle
Christian Rupprecht
Daniel Cremers
VGen
60
0
0
30 Mar 2025
MVSAnywhere: Zero-Shot Multi-View Stereo
Sergio Izquierdo
Mohamed Sayed
Michael Firman
Guillermo Garcia-Hernando
Daniyar Turmukhambetov
Javier Civera
Oisin Mac Aodha
Gabriel J. Brostow
Jamie Watson
3DV
34
3
0
28 Mar 2025
Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video
David Yifan Yao
Albert Zhai
Shenlong Wang
VGen
43
1
0
27 Mar 2025
ST-VLM: Kinematic Instruction Tuning for Spatio-Temporal Reasoning in Vision-Language Models
Dohwan Ko
S. Kim
Yumin Suh
Vijay Kumar B.G
Minseo Yoon
Manmohan Chandraker
Hyunwoo J. Kim
LRM
36
0
0
25 Mar 2025
CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis
Youngkyoon Jang
Eduardo Pérez-Pellitero
38
0
0
25 Mar 2025
MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction
Wenyuan Zhang
Yixiao Yang
Han Huang
Liang Han
Kanle Shi
Yu-Shen Liu
Zhizhong Han
MDE
53
3
0
24 Mar 2025
Distilling Monocular Foundation Model for Fine-grained Depth Completion
Yingping Liang
Yutao Hu
Wenqi Shao
Ying Fu
MDE
42
0
0
21 Mar 2025
Loop Closure from Two Views: Revisiting PGO for Scalable Trajectory Estimation through Monocular Priors
Tian Yi Lim
Boyang Sun
Marc Pollefeys
Hermann Blum
39
0
0
20 Mar 2025
UniK3D: Universal Camera Monocular 3D Estimation
Luigi Piccinelli
Christos Sakaridis
Mattia Segu
Y. Yang
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
35
0
0
20 Mar 2025
A Recipe for Generating 3D Worlds From a Single Image
Katja Schwarz
Denys Rozumnyi
Samuel Rota Buló
Lorenzo Porzi
Peter Kontschieder
VGen
74
1
0
20 Mar 2025
Vision-Language Embodiment for Monocular Depth Estimation
Jinchang Zhang
Guoyu Lu
VLM
MDE
42
0
0
18 Mar 2025
Niagara: Normal-Integrated Geometric Affine Field for Scene Reconstruction from a Single View
Xianzu Wu
Zhenxin Ai
Harry Yang
Ser-Nam Lim
Jun Liu
H. Wang
3DV
36
0
0
16 Mar 2025
Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation
Hongyu Wen
Yiming Zuo
Venkat Subramanian
Patrick Chen
Jia Deng
3DV
51
0
0
14 Mar 2025
LiSu: A Dataset and Method for LiDAR Surface Normal Estimation
Dušan Malić
Christian Fruhwirth-Reisinger
Samuel Schulter
Horst Possegger
3DV
42
0
0
11 Mar 2025
VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation
Hanzhi Chen
Boyang Sun
Anran Zhang
Marc Pollefeys
Stefan Leutenegger
LM&Ro
57
0
0
10 Mar 2025
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clement Chadebec
O. Tasar
Sanjeev Sreetharan
Benjamin Aubin
34
0
0
10 Mar 2025
H3O: Hyper-Efficient 3D Occupancy Prediction with Heterogeneous Supervision
Y. Shi
H. Cai
Amin Ansari
Fatih Porikli
33
0
0
06 Mar 2025
DuCos: Duality Constrained Depth Super-Resolution via Foundation Model
Zhiqiang Yan
Zhengxue Wang
Haoye Dong
Jun Yu Li
Jian Yang
Gim Hee Lee
64
0
0
06 Mar 2025
A Novel Solution for Drone Photogrammetry with Low-overlap Aerial Images using Monocular Depth Estimation
J. Zhong
Qi Zhou
Ming Li
Armin Gruen
Xuan Liao
MDE
53
0
0
06 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
39
0
0
05 Mar 2025
Back to the Future Cyclopean Stereo: a human perception approach combining deep and geometric constraints
Sherlon Almeida da Silva
Davi Geiger
Luiz Velho
Moacir Antonelli Ponti
28
0
0
28 Feb 2025
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Luigi Piccinelli
Christos Sakaridis
Y. Yang
Mattia Segu
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
31
6
0
27 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
99
4
0
11 Feb 2025
Survey on Monocular Metric Depth Estimation
Jiuling Zhang
VLM
62
0
0
21 Jan 2025
FrontierNet: Learning Visual Cues to Explore
Boyang Sun
Hanzhi Chen
Stefan Leutenegger
César Cadena
Marc Pollefeys
Hermann Blum
62
0
0
08 Jan 2025
DPBridge: Latent Diffusion Bridge for Dense Prediction
Haorui Ji
Taojun Lin
Hongdong Li
DiffM
41
1
0
29 Dec 2024
SolidGS: Consolidating Gaussian Surfel Splatting for Sparse-View Surface Reconstruction
Zhuowen Shen
Yuan Liu
Zhang Chen
Zhong Li
Jiepeng Wang
...
Jingdong Zhang
Yi Tian Xu
Scott Schaefer
Xin Li
Wenping Wang
3DGS
86
0
0
19 Dec 2024
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
Haoyi Jiang
Liu Liu
Tianheng Cheng
Xinjie Wang
Tianwei Lin
Zhizhong Su
W. Liu
X. Wang
3DGS
ViT
90
5
0
17 Dec 2024
RoMeO: Robust Metric Visual Odometry
JunDa Cheng
Z. Cai
Zhaoxing Zhang
Wei Yin
Matthias Müller
Michael Paulitsch
Xin Yang
81
0
0
16 Dec 2024
Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction
Dongxu Wei
Zhiqi Li
Peidong Liu
79
1
0
09 Dec 2024
PaintScene4D: Consistent 4D Scene Generation from Text Prompts
Vinayak Gupta
Yunze Man
Yu-Xiong Wang
VGen
77
0
0
05 Dec 2024
Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Jiahao Lu
Tianyu Huang
Peng Li
Zhiyang Dou
Cheng Lin
Zhiming Cui
Z. Dong
Sai-Kit Yeung
Wenping Wang
Yuan-Bin Liu
VGen
MDE
95
3
0
04 Dec 2024
AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
Yuze He
Wang Zhao
Shaohui Liu
Yubin Hu
Yushi Bai
Yu-Hui Wen
Y. Liu
84
1
0
29 Nov 2024
SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation
Duc-Hai Pham
Tung Do
P. Nguyen
Binh-Son Hua
K. Nguyen
Rang Nguyen
MDE
64
1
0
27 Nov 2024
Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration
Junyuan Deng
Wei Yin
Xiaoyang Guo
Qian Zhang
Xiaotao Hu
Weiqiang Ren
Xiaoxiao Long
P. Tan
DiffM
MDE
79
0
0
26 Nov 2024