Vision Transformers for Dense Prediction


24 March 2021
René Ranftl
Alexey Bochkovskiy
V. Koltun
    ViT
    MDE

Papers citing "Vision Transformers for Dense Prediction"

50 / 982 papers shown
Large Spatial Model: End-to-end Unposed Images to Semantic 3D
Zhiwen Fan
Jian Zhang
Wenyan Cong
Peihao Wang
Renjie Li
...
Z. Wang
Danfei Xu
B. Ivanovic
Marco Pavone
Yue Wang
3DV
41
11
0
24 Oct 2024
Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis
Liang Han
Junsheng Zhou
Yu-Shen Liu
Zhizhong Han
3DGS
35
12
0
24 Oct 2024
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Ruicheng Wang
Sicheng Xu
Cassie Dai
Jianfeng Xiang
Yu Deng
Xin Tong
Jiaolong Yang
TPM
3DH
MDE
58
30
0
24 Oct 2024
TIPS: Text-Image Pretraining with Spatial awareness
Kevis-Kokitsi Maninis
Kaifeng Chen
Soham Ghosh
Arjun Karpur
Koert Chen
...
Jan Dlabal
Dan Gnanapragasam
Mojtaba Seyedhosseini
Howard Zhou
Andre Araujo
VLM
35
3
0
21 Oct 2024
EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting
Bohao Liao
Wei-dong Zhai
Zengyu Wan
Tianzhu Zhang
Wenfei Yang
Yang Cao
Zheng-Jun Zha
3DGS
97
2
0
20 Oct 2024
DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain
Kun Wang
Zhiqiang Yan
Junkai Fan
Wanlu Zhu
X. Li
Jun Li
Jian Yang
MDE
31
5
0
19 Oct 2024
DepthSplat: Connecting Gaussian Splatting and Depth
Haofei Xu
Songyou Peng
Fangjinhua Wang
Hermann Blum
Dániel Baráth
Andreas Geiger
Marc Pollefeys
3DGS
MDE
50
29
0
17 Oct 2024
3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation
Dewei Zhou
Ji Xie
Zongxin Yang
Yi Yang
DiffM
64
7
0
16 Oct 2024
Order-aware Interactive Segmentation
Bin Wang
Anwesa Choudhuri
Meng Zheng
Zhongpai Gao
Benjamin Planche
Andong Deng
Qin Liu
Terrence Chen
Ulas Bagci
Ziyan Wu
VLM
115
1
0
16 Oct 2024
LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images
Yuzhou Cheng
Jianhao Jiao
Yue Wang
Dimitrios Kanoulas
3DGS
32
3
0
15 Oct 2024
A Simple Approach to Unifying Diffusion-based Conditional Generation
Xirui Li
Charles Herrmann
Kelvin C.K. Chan
Yinxiao Li
Deqing Sun
Chao Ma
Ming Yang
DiffM
VLM
38
1
0
15 Oct 2024
Few-shot Novel View Synthesis using Depth Aware 3D Gaussian Splatting
Raja Kumar
Vanshika Vats
3DGS
27
0
0
14 Oct 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Ziyue Li
Tianyi Zhou
MoE
66
16
0
14 Oct 2024
Browsing without Third-Party Cookies: What Do You See?
Maxwell Lin
Shihan Lin
Helen Wu
Karen Wang
Xiaowei Yang
BDL
51
0
0
14 Oct 2024
Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization
Christian Schmidt
Jens Piekenbrinck
Bastian Leibe
3DGS
30
3
0
11 Oct 2024
Diffusion-Based Depth Inpainting for Transparent and Reflective Objects
Tianyu Sun
Dingchang Hu
Yixiang Dai
Guijin Wang
DiffM
39
5
0
11 Oct 2024
A Lightweight Target-Driven Network of Stereo Matching for Inland Waterways
Jing Su
Yiqing Zhou
Yu Zhang
Chao Wang
Yi Wei
3DV
28
0
0
10 Oct 2024
O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out
Mısra Yavuz
Fatma Guney
26
0
0
10 Oct 2024
Surgical Depth Anything: Depth Estimation for Surgical Scenes using Foundation Models
Ange Lou
Yamin Li
Yike Zhang
Jack Noble
MedIm
24
4
0
09 Oct 2024
Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation
Runze Chen
Haiyong Luo
Fang Zhao
Jingze Yu
Yupeng Jia
Juan Wang
Xuepeng Ma
MDE
34
1
0
09 Oct 2024
CUBE360: Learning Cubic Field Representation for Monocular 360 Depth Estimation for Virtual Reality
Wenjie Chang
Hao Ai
Tianzhu Zhang
Lin Wang
MDE
19
0
0
08 Oct 2024
Advancements in Road Lane Mapping: Comparative Fine-Tuning Analysis of Deep Learning-based Semantic Segmentation Methods Using Aerial Imagery
Willow Liu
Shuxin Qiao
K. Gao
Hongjie He
M. Chapman
Linlin Xu
Jonathan Li
27
0
0
08 Oct 2024
A Simple Image Segmentation Framework via In-Context Examples
Yang Liu
Chenchen Jing
Hengtao Li
Muzhi Zhu
Hao Chen
Xinlong Wang
Chunhua Shen
33
6
0
07 Oct 2024
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering
Kazumoto Nakamura
Yuji Nozawa
Yu-Chieh Lin
K. Nakata
Youyang Ng
ViT
35
1
0
07 Oct 2024
CoVLM: Leveraging Consensus from Vision-Language Models for Semi-supervised Multi-modal Fake News Detection
Devank
Jayateja Kalla
Soma Biswas
34
1
0
06 Oct 2024
Image First or Text First? Optimising the Sequencing of Modalities in Large Language Model Prompting and Reasoning Tasks
Grant Wardle
Teo Susnjak
LRM
26
5
0
04 Oct 2024
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
Junyi Zhang
Charles Herrmann
Junhwa Hur
Varun Jampani
Trevor Darrell
Forrester Cole
Deqing Sun
Ming Yang
VGen
81
70
0
04 Oct 2024
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
Ziyao Zeng
Yangchao Wu
Hyoungseob Park
Daniel Wang
Fengyu Yang
Stefano Soatto
Dong Lao
Byung-Woo Hong
Alex Wong
MDE
16
7
0
03 Oct 2024
Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats
Mingyang Xie
Haoming Cai
Sachin Shah
Yiran Xu
Brandon Y. Feng
Jia-Bin Huang
Christopher A. Metzler
3DGS
37
2
0
03 Oct 2024
DecTrain: Deciding When to Train a Monocular Depth DNN Online
Zih-Sing Fu
Soumya Sudhakar
S. Karaman
Vivienne Sze
41
0
0
03 Oct 2024
Learning to Build by Building Your Own Instructions
Aaron Walsman
Muru Zhang
Adam Fishman
Ali Farhadi
Dieter Fox
23
0
0
01 Oct 2024
AUCSeg: AUC-oriented Pixel-level Long-tail Semantic Segmentation
Boyu Han
Qianqian Xu
Zhiyong Yang
Shilong Bao
Peisong Wen
Yangbangyan Jiang
Qingming Huang
26
2
0
30 Sep 2024
OptiGrasp: Optimized Grasp Pose Detection Using RGB Images for Warehouse Picking Robots
Soofiyan Atar
Yi Li
Markus Grotz
Michael Wolf
Dieter Fox
Joshua Smith
35
1
0
29 Sep 2024
KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation
Soofiyan Atar
Yuheng Zhi
Florian Richter
Michael C. Yip
MDE
34
0
0
29 Sep 2024
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He
Haodong Li
Wei Yin
Yixun Liang
Leheng Li
Kaiqiang Zhou
Hongbo Zhang
Bingbing Liu
Ying-Cong Chen
DiffM
VLM
44
40
0
26 Sep 2024
Self-Distilled Depth Refinement with Noisy Poisson Fusion
Jiaqi Li
Yiran Wang
Jinghong Zheng
Zihao Huang
Ke Xian
Zhiguo Cao
Jianming Zhang
25
2
0
26 Sep 2024
CAMOT: Camera Angle-aware Multi-Object Tracking
Felix Limanta
K. Uto
K. Shinoda
VOT
26
5
0
26 Sep 2024
Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation
Richard D. Paul
Alessio Quercia
Vincent Fortuin
Katharina Nöh
Hanno Scharr
UQCV
BDL
24
2
0
25 Sep 2024
Semantics-Controlled Gaussian Splatting for Outdoor Scene Reconstruction and Rendering in Virtual Reality
Hannah Schieber
Jacob Young
Tobias Langlotz
Stefanie Zollmann
Daniel Roth
3DGS
23
5
0
24 Sep 2024
AIM 2024 Sparse Neural Rendering Challenge: Methods and Results
Michal Nazarczuk
Sibi Catley-Chandar
T. Tanay
Richard Shaw
Eduardo Pérez-Pellitero
...
Yanyan Zu
Junpei Zhang
Licheng Jiao
Xu Liu
Kuldeep Purohit
51
7
0
23 Sep 2024
DepthART: Monocular Depth Estimation as Autoregressive Refinement Task
Bulat Gabdullin
Nina Konovalova
Nikolay Patakin
Dmitry Senushkin
Anton Konushin
MDE
30
0
0
23 Sep 2024
MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views
Wangze Xu
Huachen Gao
Shihe Shen
Rui Peng
Jianbo Jiao
Ronggang Wang
3DGS
18
8
0
22 Sep 2024
@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology
Xin Jiang
Junwei Zheng
Ruiping Liu
Jiahang Li
Jiaming Zhang
Sven Matthiesen
Rainer Stiefelhagen
VLM
21
0
0
21 Sep 2024
Towards Robust Automation of Surgical Systems via Digital Twin-based Scene Representations from Foundation Models
Hao Ding
Lalithkumar Seenivasan
Hongchao Shu
Grayson Byrd
Han Zhang
Pu Xiao
Juan Antonio Barragan
Russell H. Taylor
Peter Kazanzides
Mathias Unberath
32
5
0
19 Sep 2024
Reactive Collision Avoidance for Safe Agile Navigation
Alessandro Saviolo
Niko Picello
Rishabh Verma
Giuseppe Loianno
30
0
0
18 Sep 2024
Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB
Alessandro Simoni
Francesco Marchetti
Guido Borghi
Federico Becattini
Davide Davoli
Lorenzo Garattoni
Gianpiero Francesca
Lorenzo Seidenari
R. Vezzani
3DH
MDE
29
0
0
17 Sep 2024
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Gonzalo Martin Garcia
Karim Abou Zeid
Christian Schmidt
Daan de Geus
Alexander Hermans
Bastian Leibe
37
24
0
17 Sep 2024
BAFNet: Bilateral Attention Fusion Network for Lightweight Semantic Segmentation of Urban Remote Sensing Images
Wentao Wang
Xili Wang
SSeg
31
3
0
16 Sep 2024
Towards Real-Time Generation of Delay-Compensated Video Feeds for Outdoor Mobile Robot Teleoperation
Neeloy Chakraborty
Yixiao Fang
Andre Schreiber
Tianchen Ji
Zhe Huang
Aganze Mihigo
Cassidy Wall
Abdulrahman Almana
Katherine Driggs-Campbell
28
0
0
16 Sep 2024
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation
Hugo Porta
Emanuele Dalsasso
Diego Marcos
D. Tuia
93
0
0
14 Sep 2024