DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks

8 November 2022

Papers citing "DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks"

Title
Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection; F. Barbato, Umberto Michieli, J. Moon, Pietro Zanuttigh, Mete Ozay; 35 2 0; 01 Jul 2024
Continual Road-Scene Semantic Segmentation via Feature-Aligned Symmetric Multi-Modal Network; F. Barbato, Elena Camuffo, Simone Milani, Pietro Zanuttigh; 6 5 0; 09 Aug 2023
Source-Free Domain Adaptation for RGB-D Semantic Segmentation with Vision Transformers; Giulia Rizzoli, Donald Shenaj, Pietro Zanuttigh; ViT; 24 8 0; 23 May 2023
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions; Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao; ViT; 263 3,538 0; 24 Feb 2021
Indoor Semantic Segmentation using depth information; Camille Couprie, C. Farabet, Laurent Najman, Yann LeCun; SSeg, MDE; 59 473 0; 16 Jan 2013