ResearchTrend.AI
Vision Transformers for Dense Prediction


24 March 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
    ViT
    MDE
arXiv: 2103.13413

Papers citing "Vision Transformers for Dense Prediction"

50 / 982 papers shown
Robust Reflection Removal with Flash-only Cues in the Wild
Chenyang Lei
Xu-dong Jiang
Qifeng Chen
13
14
0
05 Nov 2022
RCDPT: Radar-Camera fusion Dense Prediction Transformer
Chen-Chou Lo
P. Vandewalle
ViT
MDE
17
13
0
04 Nov 2022
Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation
Moitreya Chatterjee
N. Ahuja
A. Cherian
33
11
0
29 Oct 2022
ImplantFormer: Vision Transformer based Implant Position Regression Using Dental CBCT Data
Xinquan Yang
Xuguang Li
Xuechen Li
Pei-Yao Wu
Linlin Shen
Yongqiang Deng
MedIm
23
8
0
29 Oct 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya-Qin Zhang
Weidi Xie
VLM
21
48
0
27 Oct 2022
M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang
Zhiwen Fan
Rishov Sarkar
Ziyu Jiang
Tianlong Chen
Kai Zou
Yu Cheng
Cong Hao
Zhangyang Wang
MoE
31
81
0
26 Oct 2022
Monocular Dynamic View Synthesis: A Reality Check
Han Gao
Ruilong Li
Shubham Tulsiani
Bryan C. Russell
Angjoo Kanazawa
20
111
0
24 Oct 2022
CroCo: Self-Supervised Pre-training for 3D Vision Tasks by Cross-View Completion
Philippe Weinzaepfel
Vincent Leroy
Thomas Lucas
Romain Brégier
Yohann Cabon
Vaibhav Arora
L. Antsfeld
Boris Chidlovskii
G. Csurka
Jérôme Revaud
SSL
36
64
0
19 Oct 2022
High-Resolution Depth Estimation for 360-degree Panoramas through Perspective and Panoramic Depth Images Registration
Chi-Han Peng
Jiayao Zhang
MDE
30
12
0
19 Oct 2022
A Tri-Layer Plugin to Improve Occluded Detection
Guanqi Zhan
Weidi Xie
Andrew Zisserman
24
20
0
18 Oct 2022
Hierarchical Normalization for Robust Monocular Depth Estimation
Chi Zhang
Wei Yin
Zhibin Wang
Gang Yu
Bin-Bin Fu
Chunhua Shen
MDE
25
38
0
18 Oct 2022
Attention Attention Everywhere: Monocular Depth Prediction with Skip Attention
Ashutosh Agarwal
Chetan Arora
MDE
14
136
0
17 Oct 2022
Multi-Task Learning based Video Anomaly Detection with Attention
M. Baradaran
R. Bergevin
24
3
0
14 Oct 2022
How to Train Vision Transformer on Small-scale Datasets?
Hanan Gani
Muzammal Naseer
Mohammad Yaqub
ViT
12
49
0
13 Oct 2022
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer
Jian Wang
Chen-xi Gou
Qiman Wu
Haocheng Feng
Junyu Han
Errui Ding
Jingdong Wang
ViT
25
95
0
13 Oct 2022
SegViT: Semantic Segmentation with Plain Vision Transformers
Bowen Zhang
Zhi Tian
Quan Tang
Xiangxiang Chu
Xiaolin K. Wei
Chunhua Shen
Yifan Liu
ViT
16
133
0
12 Oct 2022
Map-free Visual Relocalization: Metric Pose Relative to a Single Image
Eduardo Arnold
Jamie M. Wynn
Sara Vicente
Guillermo Garcia-Hernando
Áron Monszpart
V. Prisacariu
Daniyar Turmukhambetov
Eric Brachmann
19
55
0
11 Oct 2022
Self-Supervised Monocular Depth Underwater
Shlomi Amitai
Itzik Klein
T. Treibitz
MDE
32
8
0
06 Oct 2022
Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation
Jialei Xu
Xianming Liu
Yuanchao Bai
Junjun Jiang
Kaixuan Wang
Xiaozhi Chen
Xiangyang Ji
3DV
MDE
40
21
0
05 Oct 2022
Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry
André O. Françani
Marcos R. O. A. Máximo
MDE
13
9
0
04 Oct 2022
Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning
Weicong Liang
Yuhui Yuan
Henghui Ding
Xiao Luo
Weihong Lin
Ding Jia
Zheng-Wei Zhang
Chao Zhang
Hanhua Hu
25
25
0
03 Oct 2022
Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator
Zifan Shi
Yinghao Xu
Yujun Shen
Deli Zhao
Qifeng Chen
Dit-Yan Yeung
84
20
0
30 Sep 2022
Zero-shot visual reasoning through probabilistic analogical mapping
Taylor W. Webb
Shuhao Fu
Trevor J. Bihl
K. Holyoak
Hongjing Lu
LRM
55
12
0
29 Sep 2022
DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image
Yijin Li
Xinyang Liu
Wenqian Dong
Han Zhou
Hujun Bao
Guofeng Zhang
Yinda Zhang
Zhaopeng Cui
40
26
0
27 Sep 2022
UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater Robots
Boxiao Yu
Jiayi Wu
M. Islam
MDE
12
37
0
26 Sep 2022
SOCRATES: A Stereo Camera Trap for Monitoring of Biodiversity
T. Haucke
H. Kühl
Volker Steinhage
33
11
0
19 Sep 2022
SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation
Meng-Hao Guo
Chenggang Lu
Qibin Hou
Zheng Liu
Ming-Ming Cheng
Shiyong Hu
SSeg
ViT
VLM
21
607
0
18 Sep 2022
Bridging Implicit and Explicit Geometric Transformation for Single-Image View Synthesis
Byeongjun Park
Hyojun Go
Changick Kim
3DV
30
6
0
15 Sep 2022
Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation
Zhengming Zhou
Qiulei Dong
MDE
53
27
0
15 Sep 2022
VIPHY: Probing "Visible" Physical Commonsense Knowledge
Shikhar Singh
Ehsan Qasemi
Muhao Chen
43
6
0
15 Sep 2022
Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation
Mohit Shridhar
Lucas Manuelli
D. Fox
LM&Ro
155
456
0
12 Sep 2022
An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Tsu-jui Fu
Linjie Li
Zhe Gan
Kevin Qinghong Lin
William Yang Wang
Lijuan Wang
Zicheng Liu
VLM
19
63
0
04 Sep 2022
SphereDepth: Panorama Depth Estimation from Spherical Domain
Q. Yan
Qiang-qiang Wang
Kaiyong Zhao
Bo Li
X. Chu
F. Deng
MDE
26
2
0
29 Aug 2022
Uncertainty Guided Depth Fusion for Spike Camera
Jianing Li
Jiaming Liu
Xi Wei
Jiyuan Zhang
Ming Lu
Lei Ma
Li Du
Tiejun Huang
Shanghang Zhang
3DH
MDE
16
1
0
26 Aug 2022
Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer
Jiaming Liu
Qizhe Zhang
Jianing Li
Ming Lu
Tiejun Huang
Shanghang Zhang
24
10
0
26 Aug 2022
Bokeh-Loss GAN: Multi-Stage Adversarial Training for Realistic Edge-Aware Bokeh
Brian Lee
Fei Lei
Huaijin Chen
Alexis Baudron
MDE
19
6
0
25 Aug 2022
DepthFake: a depth-based strategy for detecting Deepfake videos
Luca Maiano
Lorenzo Papa
Ketbjano Vocaj
Irene Amerini
20
9
0
23 Aug 2022
Depth Map Decomposition for Monocular Depth Estimation
Jinyoung Jun
Jae-Han Lee
Chulwoo Lee
Chang-Su Kim
MDE
39
22
0
23 Aug 2022
FCN-Transformer Feature Fusion for Polyp Segmentation
Edward Sanderson
B. Matuszewski
ViT
MedIm
25
117
0
17 Aug 2022
Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation
Lingzhi Zhang
Connelly Barnes
Kevin Wampler
Sohrab Amirghodsi
Eli Shechtman
Zhe Lin
Jianbo Shi
17
6
0
06 Aug 2022
MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer
Chaoqiang Zhao
Youming Zhang
Matteo Poggi
Fabio Tosi
Xianda Guo
Zheng Zhu
Guan Huang
Yang Tang
S. Mattoccia
ViT
MDE
31
174
0
06 Aug 2022
A Lightweight Machine Learning Pipeline for LiDAR-simulation
Richard Marcus
Niklas Knoop
Bernhard Egger
Marc Stamminger
31
6
0
05 Aug 2022
A Novel Transformer Network with Shifted Window Cross-Attention for Spatiotemporal Weather Forecasting
Alabi Bojesomo
Hasan Al-Marzouqi
P. Liatsis
11
9
0
02 Aug 2022
DoF-NeRF: Depth-of-Field Meets Neural Radiance Fields
Zijin Wu
Xingyi Li
Juewen Peng
Hao Lu
Zhiguo Cao
Weicai Zhong
27
34
0
01 Aug 2022
Less is More: Consistent Video Depth Estimation with Masked Frames Modeling
Yiran Wang
Zhiyu Pan
Xingyi Li
Zhiguo Cao
Ke Xian
Jianming Zhang
27
27
0
31 Jul 2022
Global-Local Self-Distillation for Visual Representation Learning
Tim Lebailly
Tinne Tuytelaars
SSL
30
6
0
29 Jul 2022
Depth Field Networks for Generalizable Multi-view Scene Representation
Vitor Campagnolo Guizilini
Igor Vasiljevic
Jiading Fang
Rares Ambrus
G. Shakhnarovich
Matthew R. Walter
Adrien Gaidon
3DV
MDE
29
15
0
28 Jul 2022
RealFlow: EM-based Realistic Optical Flow Dataset Generation from Videos
Yunhui Han
Kunming Luo
Ao Luo
Jiangyu Liu
Haoqiang Fan
G. Luo
Shuaicheng Liu
9
23
0
22 Jul 2022
Contributions of Shape, Texture, and Color in Visual Recognition
Yunhao Ge
Yao Xiao
Zhi-Qin John Xu
X. Wang
Laurent Itti
3DH
16
26
0
19 Jul 2022
DUQIM-Net: Probabilistic Object Hierarchy Representation for Multi-View Manipulation
Vladimir Tchuiev
Yakov Miron
Dotan Di Castro
20
6
0
19 Jul 2022