GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer

3 June 2024
Ding Jia
Jianyuan Guo
Kai Han
Han Wu
Chao Zhang
Chang Xu
Xinghao Chen
    ViT

Papers citing "GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer"

24 / 24 papers shown
From Classical to Hybrid: A Practical Framework for Quantum-Enhanced Learning
Silvie Illésová
Tomáš Bezděk
Vojtěch Novák
Ivan Zelinka
Stefano Cacciatore
Martin Beseda
8
0
0
11 Nov 2025
MMMS: Multi-Modal Multi-Surface Interactive Segmentation
Robin Schon
Julian Lorenz
K. Ludwig
Daniel Kienzle
Rainer Lienhart
56
0
0
16 Sep 2025
Multimodal SAM-adapter for Semantic Segmentation
IEEE Access (IEEE Access), 2025
Iacopo Curti
Pierluigi Zama Ramirez
Alioscia Petrelli
Luigi Di Stefano
25
0
0
12 Sep 2025
DGFusion: Depth-Guided Sensor Fusion for Robust Semantic Perception
Tim Broedermann
Christos Sakaridis
Luigi Piccinelli
Wim Abbeloos
Luc Van Gool
MDE
96
0
0
11 Sep 2025
Multi-modal Uncertainty Robust Tree Cover Segmentation For High-Resolution Remote Sensing Images
Yuanyuan Gui
Wei Li
Y Samuel Wang
X. Xia
M. Marty
C. Ginzler
Z. Wang
36
0
0
05 Sep 2025
HiddenObject: Modality-Agnostic Fusion for Multimodal Hidden Object Detection
Harris Song
Tuan-Anh Vu
Sanjith Menon
Sriram Narasimhan
M. Khalid Jawed
87
0
0
28 Aug 2025
MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
Thanh-Dat Truong
Christophe Bobda
Nitin Agarwal
Khoa Luu
124
1
0
13 Aug 2025
DMTrack: Spatio-Temporal Multimodal Tracking via Dual-Adapter
Weihong Li
Shaohua Dong
Haonan Lu
Yanhao Zhang
Heng Fan
L. Zhang
45
0
0
03 Aug 2025
DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects
Guanghu Xie
Zhiduo Jiang
Yonglong Zhang
Yang Liu
Zongwu Xie
Baoshi Cao
Hong Liu
137
0
0
11 Jun 2025
Semantics-aware Predictive Inspection Path Planning
M. Dharmadhikari
Kostas Alexis
76
0
0
06 Jun 2025
DGIQA: Depth-guided Feature Attention and Refinement for Generalizable Image Quality Assessment
Vaishnav Ramesh
Junliang Liu
Haining Wang
Md Jahidul Islam
164
2
0
29 May 2025
DepthMatch: Semi-Supervised RGB-D Scene Parsing through Depth-Guided Regularization
IEEE Signal Processing Letters (IEEE SPL), 2025
Jianxin Huang
Jiahang Li
S. Vityazev
Alexander Dvorkovich
Rui Fan
3DV
MDE
133
3
0
26 May 2025
RMMSS: Towards Advanced Robust Multi-Modal Semantic Segmentation with Hybrid Prototype Distillation and Feature Selection
Jiaqi Tan
Xu Zheng
Yuhang Liu
188
0
0
19 May 2025
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2025
Bo Yin
Jiao-Long Cao
Ming-Ming Cheng
Qibin Hou
3DPC
MDE
190
8
0
07 Apr 2025
Balancing Task-invariant Interaction and Task-specific Adaptation for Unified Image Fusion
Xingyu Hu
Junjun Jiang
Chenyang Wang
Kui Jiang
Xianming Liu
Jiayi Ma
222
0
0
07 Apr 2025
Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception
Computer Vision and Pattern Recognition (CVPR), 2025
Luke Chen
Junyao Wang
Trier Mortlock
Pramod P. Khargonekar
M. A. Al Faruque
UQCV
256
1
0
25 Mar 2025
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
Paul Koch
Jörg Krüger
Ankit Chowdhury
O. Heimann
MDE
192
0
0
25 Mar 2025
Benchmarking Multi-modal Semantic Segmentation under Sensor Failures: Missing and Noisy Modality Robustness
Chenfei Liao
Kaiyu Lei
Xu Zheng
Junha Moon
Zhixiong Wang
Longji Xu
Danda Pani Paudel
Luc Van Gool
Xuming Hu
VLM
224
15
0
24 Mar 2025
Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation
IEEE Geoscience and Remote Sensing Letters (GRSL), 2025
Leideng Shi
Juan Zhang
191
5
0
14 Mar 2025
Unveiling the Potential of Segment Anything Model 2 for RGB-Thermal Semantic Segmentation with Language Guidance
Jiayi Zhao
Fei Teng
Kai Luo
Guoqiang Zhao
Hui Yuan
Xu Zheng
Kailun Yang
VLM
237
9
0
04 Mar 2025
Unifying Light Field Perception with Field of Parallax
Fei Teng
Buyin Deng
Boyuan Zheng
Kai Luo
Kunyu Peng
Kailai Li
Kailun Yang
132
0
0
02 Mar 2025
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning
International Journal of Computer Vision (IJCV), 2024
Zhiwei Hao
Jianyuan Guo
Li Shen
Yong Luo
Han Hu
Yonggang Wen
VLM
181
2
0
23 Oct 2024
Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes
Jianqi Chen
Panwen Hu
Xiaojun Chang
Z. Shi
Michael C. Kampffmeyer
Xiaodan Liang
270
10
0
14 Oct 2024
ShareCMP: Polarization-Aware RGB-P Semantic Segmentation
Zhuoyan Liu
Bo Wang
Lizhi Wang
Chenyu Mao
Ye Li
247
2
0
06 Dec 2023