ResearchTrend.AI
Multimodal Token Fusion for Vision Transformers

19 April 2022
Yikai Wang
Xinghao Chen
Lele Cao
Wen-bing Huang
Fuchun Sun
Yunhe Wang
    ViT

Papers citing "Multimodal Token Fusion for Vision Transformers"

18 / 18 papers shown
Position: Foundation Models Need Digital Twin Representations
Yiqing Shen
Hao Ding
Lalithkumar Seenivasan
Tianmin Shu
Mathias Unberath
AI4CE
35
0
0
01 May 2025
HDBFormer: Efficient RGB-D Semantic Segmentation with A Heterogeneous Dual-Branch Framework
Shuobin Wei
Zhuang Zhou
Zhengan Lu
Zizhao Yuan
Binghua Su
MDE
42
0
0
18 Apr 2025
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness
Chenfei Liao
Kaiyu Lei
Xu Zheng
Junha Moon
Zhixiong Wang
Y. Wang
Danda Pani Paudel
Luc Van Gool
Xuming Hu
VLM
68
2
0
24 Mar 2025
Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation
Yunzhi Zhuge
Hongyu Gu
Lu Zhang
Jinqing Qi
Huchuan Lu
VOS
67
2
0
14 Jan 2025
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding
Yi Liu
Chengxin Li
Shoukun Xu
J. Han
ViT
35
2
0
19 Oct 2024
Order-aware Interactive Segmentation
Bin Wang
Anwesa Choudhuri
Meng Zheng
Zhongpai Gao
Benjamin Planche
Andong Deng
Qin Liu
Terrence Chen
Ulas Bagci
Ziyan Wu
VLM
82
1
0
16 Oct 2024
SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Jialong Guo
Xinghao Chen
Yehui Tang
Yunhe Wang
ViT
47
9
0
19 May 2024
Unleashing the Power of CNN and Transformer for Balanced RGB-Event Video Recognition
Xiao Wang
Yao Rong
Shiao Wang
Yuan Chen
Zhe Wu
Bowei Jiang
Yonghong Tian
Jin Tang
ViT
76
3
0
18 Dec 2023
ASY-VRNet: Waterway Panoptic Driving Perception Model based on Asymmetric Fair Fusion of Vision and 4D mmWave Radar
Runwei Guan
Shanliang Yao
Xiaohui Zhu
Ka Lok Man
Yong Yue
Jeremy S. Smith
Eng Gee Lim
Yutao Yue
25
9
0
20 Aug 2023
From Sparse to Soft Mixtures of Experts
J. Puigcerver
C. Riquelme
Basil Mustafa
N. Houlsby
MoE
121
114
0
02 Aug 2023
An Object SLAM Framework for Association, Mapping, and High-Level Tasks
Yanmin Wu
Yunzhou Zhang
Delong Zhu
Zhiqiang Deng
Wenkai Sun
Xin Chen
Jian Zhang
16
34
0
12 May 2023
Impact of Pseudo Depth on Open World Object Segmentation with Minimal User Guidance
Robin Schon
K. Ludwig
Rainer Lienhart
VLM
MDE
22
2
0
12 Apr 2023
Pixel Difference Convolutional Network for RGB-D Semantic Segmentation
Jun Yang
Lizhi Bai
Yaoru Sun
Chunqi Tian
Maoyu Mao
Guorun Wang
SSeg
11
16
0
23 Feb 2023
Emerging Threats in Deep Learning-Based Autonomous Driving: A Comprehensive Survey
Huiyun Cao
Wenlong Zou
Yinkun Wang
Ting Song
Mengjun Liu
AAML
35
4
0
19 Oct 2022
ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions
Anjun Chen
Xiangyu Wang
Kun Shi
Shaohao Zhu
Bin Fang
Yingke Chen
Jiming Chen
Yuchi Huo
Qi Ye
3DH
25
20
0
04 Oct 2022
Multimodal Learning with Transformers: A Survey
P. Xu
Xiatian Zhu
David A. Clifton
ViT
41
522
0
13 Jun 2022
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers
Jiaming Zhang
Huayao Liu
Kailun Yang
Xinxin Hu
Ruiping Liu
Rainer Stiefelhagen
ViT
21
295
0
09 Mar 2022
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas J. Guibas
3DPC
182
245
0
29 Jan 2020