Vision Transformers for Dense Prediction

IEEE International Conference on Computer Vision (ICCV), 2021

24 March 2021


Papers citing "Vision Transformers for Dense Prediction"

50 / 1,224 papers shown

Perception Encoder: The best visual embeddings are not at the output of the network

Daniel Bolya

Po-Yao (Bernie) Huang

...

Christoph Feichtenhofer

ObjD VOS

678

118

17 Apr 2025

SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling

310

17 Apr 2025

Regist3R: Incremental Registration with Stereo Foundation Model

434

16 Apr 2025

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

Computer Vision and Pattern Recognition (CVPR), 2025

287

16 Apr 2025

Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image

360

16 Apr 2025

Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception

International Conference on Learning Representations (ICLR), 2025

491

15 Apr 2025

SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data

Jonathan Prexl

M. Recla

M. Schmitt

255

11 Apr 2025

PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction

Computer Vision and Pattern Recognition (CVPR), 2025

310

11 Apr 2025

Novel Pooling-based VGG-Lite for Pneumonia and Covid-19 Detection from Imbalanced Chest X-Ray Datasets

IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI), 2025

355

10 Apr 2025

FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution

466

09 Apr 2025

MonoPlace3D: Learning 3D-Aware Object Placement for 3D Monocular Detection

Computer Vision and Pattern Recognition (CVPR), 2025

1.0K

09 Apr 2025

D^2USt3R: Enhancing 3D Reconstruction for Dynamic Scenes

297

08 Apr 2025

POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction

322

08 Apr 2025

Window Token Concatenation for Efficient Visual Large Language Models

277

05 Apr 2025

Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation

Computer Vision and Pattern Recognition (CVPR), 2025

Xin Zhang

Robby T. Tan

Mamba

305

04 Apr 2025

Optimizing 4D Gaussians for Dynamic Scene Video from Single Landscape Images

International Conference on Learning Representations (ICLR), 2025

337

04 Apr 2025

PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation

311

03 Apr 2025

Monocular and Generalizable Gaussian Talking Head Animation

Computer Vision and Pattern Recognition (CVPR), 2025

234

01 Apr 2025

ADGaussian: Generalizable Gaussian Splatting for Autonomous Driving with Multi-modal Inputs

389

01 Apr 2025

Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views

Computer Vision and Pattern Recognition (CVPR), 2025

226

31 Mar 2025

Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach

272

31 Mar 2025

Easi3R: Estimating Disentangled Motion from DUSt3R Without Training

401

31 Mar 2025

NeoARCADE: Robust Calibration for Distance Estimation to Support Assistive Drones for the Visually Impaired

341

31 Mar 2025

BoundMatch: Boundary detection applied to semi-supervised segmentation

IEEE Access, 2025

Haruya Ishikawa

Yoshimitsu Aoki

580

30 Mar 2025

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model

386

30 Mar 2025

One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images

Byeongjun Kwon

Munchurl Kim

VLM MDE

361

28 Mar 2025

MVSAnywhere: Zero-Shot Multi-View Stereo

Computer Vision and Pattern Recognition (CVPR), 2025

Sergio Izquierdo

Mohamed Sayed

Michael Firman

Guillermo Garcia-Hernando

Daniyar Turmukhambetov

374

28 Mar 2025

Deep Depth Estimation from Thermal Image: Dataset, Benchmark, and Challenges

Ukcheol Shin

Jinsun Park

3DV MDE

257

28 Mar 2025

DuckSegmentation: A segmentation model based on the AnYue Hemp Duck Dataset

181

27 Mar 2025

The Coralscapes Dataset: Semantic Scene Understanding in Coral Reefs

297

25 Mar 2025

FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion

473

25 Mar 2025

Semi-SMD: Semi-Supervised Metric Depth Estimation via Surrounding Cameras for Autonomous Driving

428

25 Mar 2025

Co-SemDepth: Fast Joint Semantic Segmentation and Depth Estimation on Aerial Images

Yara AlaaEldin

Francesca Odone

MDE

383

23 Mar 2025

ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling

Radu Beche

Sergiu Nedevschi

955

22 Mar 2025

Co-op: Correspondence-based Novel Object Pose Estimation

Computer Vision and Pattern Recognition (CVPR), 2025

254

22 Mar 2025

UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models

International Conference on Learning Representations (ICLR), 2025

488

21 Mar 2025

Pow3R: Empowering Unconstrained 3D Reconstruction with Camera and Scene Priors

Computer Vision and Pattern Recognition (CVPR), 2025

278

21 Mar 2025

Radar-Guided Polynomial Fitting for Metric Depth Estimation

373

21 Mar 2025

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Computer Vision and Pattern Recognition (CVPR), 2025

...

355

20 Mar 2025

Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras

...

245

20 Mar 2025

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

Computer Vision and Pattern Recognition (CVPR), 2025

314

19 Mar 2025

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Computer Vision and Pattern Recognition (CVPR), 2025

314

19 Mar 2025

Bolt3D: Generating 3D Scenes in Seconds

Stanislaw Szymanowicz

Ricardo Martín Brualla

Jonathan T. Barron

Philipp Henzler

408

18 Mar 2025

Learning Efficient Fuse-and-Refine for Feed-Forward 3D Gaussian Splatting

542

18 Mar 2025

Deblur Gaussian Splatting SLAM

271

16 Mar 2025

VGGT: Visual Geometry Grounded Transformer

Computer Vision and Pattern Recognition (CVPR), 2025

521

550

14 Mar 2025

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

Computer Vision and Pattern Recognition (CVPR), 2025

Xunzhi Zheng

Dan Xu

AI4CE

274

13 Mar 2025

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis

1.3K

13 Mar 2025

VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames

272

13 Mar 2025

Knowledge Consultation for Semi-Supervised Semantic Segmentation

391

12 Mar 2025