2103.13413
Cited By
Vision Transformers for Dense Prediction
24 March 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT
MDE
Papers citing
"Vision Transformers for Dense Prediction"
50 / 982 papers shown
Title
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
18
0
0
15 May 2025
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis
B. Ke
Kevin Qu
T. Wang
Nando Metzger
Shengyu Huang
Bo Li
Anton Obukhov
Konrad Schindler
DiffM
VLM
20
0
0
14 May 2025
FreeDriveRF: Monocular RGB Dynamic NeRF without Poses for Autonomous Driving via Point-Level Dynamic-Static Decoupling
Yue Wen
Liang Song
Y. Liu
Siting Zhu
Yanzi Miao
Lijun Han
Hesheng Wang
28
0
0
14 May 2025
Camera-Only Bird's Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles
Anupkumar Bochare
14
0
0
09 May 2025
DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion
Qitao Zhao
Amy Lin
Jeff Tan
Jason Y. Zhang
Deva Ramanan
Shubham Tulsiani
VGen
46
0
0
08 May 2025
VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery
Bojin Wu
Jing Chen
MDE
44
0
0
05 May 2025
JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
Kwon Byung-Ki
Qi Dai
Lee Hyoseok
Chong Luo
Tae-Hyun Oh
59
0
0
01 May 2025
Pixel3DMM: Versatile Screen-Space Priors for Single-Image 3D Face Reconstruction
Simon Giebenhain
Tobias Kirschstein
Martin Rünz
Lourdes Agapito
Matthias Nießner
CVBM
3DH
57
0
0
01 May 2025
Adept: Annotation-Denoising Auxiliary Tasks with Discrete Cosine Transform Map and Keypoint for Human-Centric Pretraining
Weizhen He
Yunfeng Yan
Shixiang Tang
Yiheng Deng
Yangyang Zhong
Pengxin Luo
Donglian Qi
VLM
88
1
0
29 Apr 2025
Category-Level and Open-Set Object Pose Estimation for Robotics
Peter Honig
Matthias Hirschmanner
Markus Vincze
29
0
0
28 Apr 2025
Joint Optimization of Neural Radiance Fields and Continuous Camera Motion from a Monocular Video
Hoang Chuong Nguyen
Wei Mao
Jose M. Alvarez
Miaomiao Liu
52
0
0
28 Apr 2025
Leveraging Multi-Modal Saliency and Fusion for Gaze Target Detection
Athul M. Mathew
Arshad Ali Khan
Thariq Khalid
Faroq AL-Tam
R. Souissi
77
1
0
27 Apr 2025
Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models
Patrick Müller
Alexander Braun
M. Keuper
52
0
0
25 Apr 2025
The Fourth Monocular Depth Estimation Challenge
Anton Obukhov
Matteo Poggi
Fabio Tosi
Ripudaman Singh Arora
Jaime Spencer
...
Tuan-Anh Yang
Minh-Quang Nguyen
T. Tran
Albert Luginov
Muhammad Shahzad
MDE
94
0
0
24 Apr 2025
Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections
Max Kirchner
Alexander C. Jenke
S. Bodenstedt
F. Kolbinger
Oliver Saldanha
Jakob N. Kather
M. Wagner
Stefanie Speidel
FedML
MedIm
64
0
0
23 Apr 2025
SmallGS: Gaussian Splatting-based Camera Pose Estimation for Small-Baseline Videos
Yuxin Yao
Yan Zhang
Zhening Huang
Joan Lasenby
3DGS
19
0
0
22 Apr 2025
Landmark-Free Preoperative-to-Intraoperative Registration in Laparoscopic Liver Resection
Jun Zhou
Bingchen Gao
Kai Wang
Jialun Pei
Pheng-Ann Heng
Jing Qin
MedIm
32
0
0
21 Apr 2025
VistaDepth: Frequency Modulation With Bias Reweighting For Enhanced Long-Range Depth Estimation
Mingxia Zhan
Li Zhang
Xiaomeng Chu
Beibei Wang
MDE
57
0
0
21 Apr 2025
PRISM: A Unified Framework for Photorealistic Reconstruction and Intrinsic Scene Modeling
Alara Dirik
Tuanfeng Y. Wang
Duygu Ceylan
Stefanos Zafeiriou
Anna Frühstück
DiffM
40
0
0
19 Apr 2025
Visual Consensus Prompting for Co-Salient Object Detection
J. T. Wang
Nana Yu
Zihao Zhang
Yahong Han
24
0
0
19 Apr 2025
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models
Haiwen Huang
Anpei Chen
Volodymyr Havrylov
Andreas Geiger
Dan Zhang
34
1
0
18 Apr 2025
Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction
Wenyu Li
Sidun Liu
Peng Qiao
Yong Dou
25
0
0
18 Apr 2025
SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling
Yasin Almalioglu
Andrzej Kucik
Geoffrey French
Dafni Antotsiou
Alexander Adam
Cedric Archambeau
21
0
0
17 Apr 2025
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World
Haiwen Feng
Junyi Zhang
Qianqian Wang
Yufei Ye
Pengcheng Yu
Michael J. Black
Trevor Darrell
Angjoo Kanazawa
VGen
3DV
50
1
0
17 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
103
0
0
17 Apr 2025
Regist3R: Incremental Registration with Stereo Foundation Model
Sidun Liu
Wenyu Li
Peng Qiao
Yong Dou
3DV
44
0
0
16 Apr 2025
TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion
Y. Wang
J. Li
Chaoyi Hong
Ruibo Li
Liusheng Sun
Xiao-yang Song
Zhe Wang
Zhiguo Cao
Guosheng Lin
MDE
29
0
0
16 Apr 2025
Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image
Tao Wen
J. Wang
Y. Chen
Shugong Xu
Chi Zhang
Xuelong Li
MDE
31
0
0
16 Apr 2025
Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception
Ziqi Pang
Xin Xu
Yu-Xiong Wang
DiffM
60
0
0
15 Apr 2025
SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data
Jonathan Prexl
M. Recla
M. Schmitt
29
0
0
11 Apr 2025
PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction
Mingzhi Pei
Xu Cao
Xiangyi Wang
Heng Guo
Zhanyu Ma
3DV
50
0
0
11 Apr 2025
Novel Pooling-based VGG-Lite for Pneumonia and Covid-19 Detection from Imbalanced Chest X-Ray Datasets
Santanu Roy
Ashvath Suresh
Palak Sahu
Tulika Rudra Gupta
29
0
0
10 Apr 2025
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou
Wenqi Xian
Guandao Yang
Mohamed Abdelfattah
Bharath Hariharan
Noah Snavely
Ning Yu
P. Debevec
MDE
27
0
0
09 Apr 2025
MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection
Rishubh Parihar
Srinjay Sarkar
Sarthak Vora
Jogendra Nath Kundu
R. V. Babu
95
0
0
09 Apr 2025
D^2USt3R: Enhancing 3D Reconstruction with 4D Pointmaps for Dynamic Scenes
Jisang Han
Honggyu An
Jaewoo Jung
Takuya Narihira
Junyoung Seo
Kazumi Fukuda
Chaehyun Kim
Sunghwan Hong
Yuki Mitsufuji
Seungryong Kim
38
0
0
08 Apr 2025
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
Songyan Zhang
Yongtao Ge
Jinyuan Tian
Guangkai Xu
Hao Chen
Chen Lv
Chunhua Shen
3DPC
24
0
0
08 Apr 2025
Window Token Concatenation for Efficient Visual Large Language Models
Yifan Li
Wentao Bao
Botao Ye
Zhen Tan
Tianlong Chen
Huan Liu
Yu Kong
VLM
41
0
0
05 Apr 2025
Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation
Xin Zhang
Robby T. Tan
Mamba
48
0
0
04 Apr 2025
Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images
In-Hwan Jin
Haesoo Choo
Seong-Hun Jeong
Heemoon Park
Junghwan Kim
Oh-joon Kwon
Kyeongbo Kong
3DGS
34
0
0
04 Apr 2025
PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation
Lihua Liu
Jiehong Lin
Zhenxin Liu
Kui Jia
38
0
0
03 Apr 2025
ADGaussian: Generalizable Gaussian Splatting for Autonomous Driving with Multi-modal Inputs
Qi Song
Chenghong Li
Haotong Lin
Sida Peng
Rui Huang
3DGS
46
0
0
01 Apr 2025
Monocular and Generalizable Gaussian Talking Head Animation
Shengjie Gong
H. Li
Jiapeng Tang
Dongming Hu
Shuangping Huang
Hao Chen
Tianshui Chen
Zhuoman Liu
3DGS
41
1
0
01 Apr 2025
Easi3R: Estimating Disentangled Motion from DUSt3R Without Training
Xingyu Chen
Yue Chen
Yuliang Xiu
Andreas Geiger
Anpei Chen
3DPC
VGen
38
1
0
31 Mar 2025
Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach
Francesco P. Ramunno
Paolo Massa
Vitaliy Kinakh
Brandon Panos
A. Csillaghy
S. Voloshynovskiy
DiffM
53
0
0
31 Mar 2025
Distance Estimation to Support Assistive Drones for the Visually Impaired using Robust Calibration
Suman Raj
Bhavani A Madhabhavi
Madhav Kumar
Prabhav Gupta
Yogesh Simmhan
43
1
0
31 Mar 2025
Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views
Chong Bao
Xiyu Zhang
Zehao Yu
Jiale Shi
Guofeng Zhang
Songyou Peng
Zhaopeng Cui
3DGS
3DV
36
0
0
31 Mar 2025
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model
Jannik Endres
Oliver Hahn
Charles Corbière
Simone Schaub-Meyer
Stefan Roth
Alexandre Alahi
MDE
37
0
0
30 Mar 2025
One Look is Enough: A Novel Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation Models on High-Resolution Images
Byeongjun Kwon
Munchurl Kim
VLM
MDE
57
0
0
28 Mar 2025
MVSAnywhere: Zero-Shot Multi-View Stereo
Sergio Izquierdo
Mohamed Sayed
Michael Firman
Guillermo Garcia-Hernando
Daniyar Turmukhambetov
Javier Civera
Oisin Mac Aodha
Gabriel J. Brostow
Jamie Watson
3DV
39
3
0
28 Mar 2025
Deep Depth Estimation from Thermal Image: Dataset, Benchmark, and Challenges
Ukcheol Shin
Jinsun Park
3DV
MDE
39
0
0
28 Mar 2025