Vision Transformers for Dense Prediction

IEEE International Conference on Computer Vision (ICCV), 2021
24 March 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
    ViT, MDE
ArXiv (abs) | PDF | HTML | HuggingFace (1 upvote) | GitHub (2138★)

Papers citing "Vision Transformers for Dense Prediction"

50 / 1,224 papers shown
GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats
K. Deng
Zhiqiang Wang
Shenlong Wang
J. Xie
3DGS
344
0
0
11 Mar 2025
1LoRA: Summation Compression for Very Low-Rank Adaptation
Alessio Quercia
Zhuo Cao
Arya Bangun
Richard D. Paul
Abigail Morrison
Ira Assent
Hanno Scharr
222
2
0
11 Mar 2025
LBM: Latent Bridge Matching for Fast Image-to-Image Translation
Clement Chadebec
O. Tasar
Sanjeev Sreetharan
Benjamin Aubin
456
12
0
10 Mar 2025
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity
Xiaohao Xu
Feng Xue
Xianrui Li
Haowei Li
Steve Yang
Tianze Zhang
Matthew Johnson-Roberson
Xiaonan Huang
3DV
280
1
0
08 Mar 2025
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images
Rohit Menon
Nils Dengler
Sicong Pan
Gokul Krishna Chenchani
Maren Bennewitz
EDL
470
0
0
06 Mar 2025
Underlying Semantic Diffusion for Effective and Efficient In-Context Learning
Zhong Ji
Weilong Cao
Yan Zhang
Yanwei Pang
Jungong Han
Xuelong Li
DiffM, VLM
317
1
0
06 Mar 2025
S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting
Computer Vision and Pattern Recognition (CVPR), 2025
Yecong Wan
Mingwen Shao
Yuanshuo Cheng
W. Zuo
441
15
0
06 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Xiangwei Zhu
473
0
0
05 Mar 2025
COARSE: Collaborative Pseudo-Labeling with Coarse Real Labels for Off-Road Semantic Segmentation
Aurelio Noca
Xianmei Lei
Jonathan Becktor
J. Edlund
Anna Sabel
Patrick Spieler
Curtis Padgett
Alexandre Alahi
Deegan Atha
348
0
0
05 Mar 2025
SVDC: Consistent Direct Time-of-Flight Video Depth Completion with Frequency Selective Fusion
Computer Vision and Pattern Recognition (CVPR), 2025
Xuan Zhu
Jijun Xiang
Longliang Liu
Yuanbo Wang
Hao Zhang
Fei Guo
Xin-She Yang
407
5
0
03 Mar 2025
MUSt3R: Multi-view Network for Stereo 3D Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2025
Yohann Cabon
Lucas Stoffl
L. Antsfeld
G. Csurka
Boris Chidlovskii
Jérôme Revaud
Vincent Leroy
3DGS, 3DV
290
57
0
03 Mar 2025
Blind Augmentation: Calibration-free Camera Distortion Model Estimation for Real-time Mixed-reality Consistency
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025
Siddhant Prakash
David R. Walton
R. K. D. Anjos
A. Steed
Tobias Ritschel
200
1
0
03 Mar 2025
Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning
IEEE International Conference on Robotics and Automation (ICRA), 2025
Ukcheol Shin
Kyunghyun Lee
Jean Oh
MDE
365
1
0
02 Mar 2025
Bring Your Own Grasp Generator: Leveraging Robot Grasp Generation for Prosthetic Grasping
IEEE International Conference on Robotics and Automation (ICRA), 2025
Giuseppe Stracquadanio
Federico Vasile
Elisa Maiettini
Nicoló Boccardo
Lorenzo Natale
272
2
0
01 Mar 2025
Back to the Future Cyclopean Stereo: a human perception approach combining deep and geometric constraints
Sherlon Almeida da Silva
Davi Geiger
Luiz Velho
Moacir Antonelli Ponti
290
0
0
28 Feb 2025
TrackGS: Optimizing COLMAP-Free 3D Gaussian Splatting with Global Track Constraints
D. Shi
Shen Cao
Lubin Fan
Bojian Wu
Jinhui Guo
Renjie Chen
Ligang Liu
3DGS
328
3
0
27 Feb 2025
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Luigi Piccinelli
Daniel Gehrig
Yifan Yang
Mattia Segu
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
544
75
0
27 Feb 2025
LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He
Xiaodong Gu
Xiaodan Ye
Chao Xu
Zhengyi Zhao
Yuan Dong
Weihao Yuan
Zilong Dong
Liefeng Bo
3DGS
556
5
0
25 Feb 2025
FoREST: Frame of Reference Evaluation in Spatial Reasoning Tasks
Tanawan Premsri
Parisa Kordjamshidi
371
4
0
25 Feb 2025
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model
International Conference on Learning Representations (ICLR), 2025
Yaxuan Huang
Xili Dai
Jianan Wang
Xianbiao Qi
Yixing Yuan
Xiangyu Yue
562
4
0
24 Feb 2025
Challenges of Multi-Modal Coreset Selection for Depth Prediction
Viktor Moskvoretskii
Narek Alvandian
201
0
0
20 Feb 2025
L4P: Towards Unified Low-Level 4D Vision Perception
Abhishek Badki
Hang Su
Bowen Wen
Orazio Gallo
VLM
468
4
0
18 Feb 2025
FLARE: Feed-forward Geometry, Appearance and Camera Estimation from Uncalibrated Sparse Views
Computer Vision and Pattern Recognition (CVPR), 2025
Shangzhan Zhang
Jianyuan Wang
Yinghao Xu
Nan Xue
Christian Rupprecht
Xiaowei Zhou
Yujun Shen
Gordon Wetzstein
607
94
0
17 Feb 2025
NPSim: Nighttime Photorealistic Simulation From Daytime Images With Monocular Inverse Rendering and Ray Tracing
Shutong Zhang
357
1
0
15 Feb 2025
CoL3D: Collaborative Learning of Single-view Depth and Camera Intrinsics for Metric 3D Shape Recovery
IEEE International Conference on Robotics and Automation (ICRA), 2025
Chenghao Zhang
Lubin Fan
Shen Cao
Bojian Wu
Jieping Ye
478
0
0
13 Feb 2025
Matrix3D: Large Photogrammetry Model All-in-One
Computer Vision and Pattern Recognition (CVPR), 2025
Yuanxun Lu
Jingyang Zhang
Tian Fang
Jean-Daniel Nahmias
Yanghai Tsin
Long Quan
Xun Cao
Yao Yao
Shiwei Li
689
21
0
11 Feb 2025
Semantic to Structure: Learning Structural Representations for Infringement Detection
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Chuanwei Huang
Zexi Jia
Hongyan Fei
Yeshuang Zhu
Zhiqiang Yuan
Jinchao Zhang
Jie Zhou
DiffM
233
6
0
11 Feb 2025
Revisiting Gradient-based Uncertainty for Monocular Depth Estimation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Julia Hornauer
Amir El-Ghoussani
Vasileios Belagiannis
UQCV
285
3
0
09 Feb 2025
Edge Attention Module for Object Classification
Santanu Roy
Ashvath Suresh
Archit Gupta
235
1
0
05 Feb 2025
Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding
Towards Autonomous Robotic Systems (TAROS), 2025
Jingming Xia
Guanqun Cao
Guang Ma
Yiben Luo
Qinzhao Li
John Oyekan
MDE
313
0
0
01 Feb 2025
MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model
Jihyeok Kim
Seongwoo Moon
Sungwon Nah
David Hyunchul Shim
MDE
412
2
0
01 Feb 2025
CheapNVS: Real-Time On-Device Narrow-Baseline Novel View Synthesis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
K. Georgiadis
M. K. Yucel
Albert Saà-Garriga
ViT
320
1
0
24 Jan 2025
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Computer Vision and Pattern Recognition (CVPR), 2025
Jianing Yang
Alexander Sax
Kevin J. Liang
Mikael Henaff
Hao Tang
Ang Cao
J. Chai
Franziska Meier
Matt Feiszli
3DGS
769
169
0
23 Jan 2025
Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2025
Alessio Quercia
Erenus Yildiz
Zhuo Cao
Kai Krajsek
Abigail Morrison
Ira Assent
Hanno Scharr
302
0
0
22 Jan 2025
Continuous 3D Perception Model with Persistent State
Computer Vision and Pattern Recognition (CVPR), 2025
Qianqian Wang
Yifei Zhang
Aleksander Hołyński
Alexei A. Efros
Angjoo Kanazawa
VGen
355
236
0
21 Jan 2025
Survey on Monocular Metric Depth Estimation
Jiuling Zhang
VLM
726
10
0
21 Jan 2025
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
Computer Vision and Pattern Recognition (CVPR), 2025
Sili Chen
Hengkai Guo
Shengnan Zhu
Feihu Zhang
Zilong Huang
Jiashi Feng
Bingyi Kang
MDE, VLM, AuLLM
619
114
0
21 Jan 2025
See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Zongqi He
Zhe Xiao
Kin-Chung Chan
Yushen Zuo
Jun Xiao
Kin-Man Lam
3DGS
341
8
0
20 Jan 2025
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation
European Conference on Computer Vision (ECCV), 2024
R. Yasarla
Manish Kumar Singh
Hong Cai
Yunxiao Shi
Jisoo Jeong
Yinhao Zhu
Shizhong Han
Risheek Garrepalli
Fatih Porikli
MDE
516
12
0
17 Jan 2025
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
IEEE International Conference on Computer Vision (ICCV), 2023
R. Yasarla
H. Cai
Jisoo Jeong
Y. Shi
Risheek Garrepalli
Fatih Porikli
MDE
600
27
0
17 Jan 2025
MonSter++: Unified Stereo Matching, Multi-view Stereo, and Real-time Stereo with Monodepth Priors
JunDa Cheng
Longliang Liu
Gangwei Xu
Longliang Liu
Zhenru Zhang
...
Yong Deng
Xin-She Yang
Yangyang Shi
Jinhui Tang
Xin Yang
3DV, MDE
387
28
0
15 Jan 2025
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
Computer Vision and Pattern Recognition (CVPR), 2025
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
392
3
0
14 Jan 2025
Kolmogorov-Arnold Network for Remote Sensing Image Semantic Segmentation
Xianping Ma
Ziyao Wang
Yin Hu
Xiaokang Zhang
Man-On Pun
242
6
0
13 Jan 2025
OneLLM: One Framework to Align All Modalities with Language
Computer Vision and Pattern Recognition (CVPR), 2023
Jiaming Han
Kaixiong Gong
Yiyuan Zhang
Yuan Liu
Kaipeng Zhang
Dahua Lin
Yu Qiao
Shiyang Feng
Xiangyu Yue
MLLM
577
198
0
10 Jan 2025
Powerful Design of Small Vision Transformer on CIFAR10
Gent Wu
ViT
269
2
0
07 Jan 2025
A Novel Vision Transformer for Camera-LiDAR Fusion based Traffic Object Segmentation
International Conference on Agents and Artificial Intelligence (ICAART), 2025
Toomas Tahves
Junyi Gu
M. Bellone
Raivo Sell
ViT
195
0
0
06 Jan 2025
PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation
Zhenyu Li
Wenqing Cui
S. Bhat
Peter Wonka
MDE
350
1
0
03 Jan 2025
TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions
Vriksha Srihari
R. Bhavya
Shruti Jayaraman
V. Mary Anita Rajam
DiffM, VGen
331
0
0
02 Jan 2025
MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning
Chunpu Liu
Guanglei Yang
Wangmeng Zuo
Tianyi Zan
MDE
343
0
0
31 Dec 2024
TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation
Shaoqing Xu
Fang Li
Peixiang Huang
Ziying Song
Zhi-xin Yang
78
7
0
31 Dec 2024