
What are You Looking at? Modality Contribution in Multimodal Medical Deep Learning Methods

Abstract

Purpose: High-dimensional, multimodal data can nowadays be analyzed by large deep neural networks with little effort. Several fusion methods for combining different modalities have been developed. Particularly in medicine, with its abundance of high-dimensional multimodal patient data, multimodal models represent the next step. However, how these models process the source information in detail is still largely underexplored.

Methods: To this end, we implemented an occlusion-based, model- and performance-agnostic modality contribution method that quantitatively measures the importance of each modality in the dataset for the model to fulfill its task. We applied our method to three different multimodal medical problems for experimental purposes.

Results: We found that some networks have modality preferences that tend toward unimodal collapse, while some datasets are imbalanced from the outset. Moreover, we observed a link between our metric and the performance of networks trained on a single modality.

Conclusion: The information gained through our metric holds remarkable potential to improve the development of multimodal models and the creation of datasets in the future. With our method we make a crucial contribution to the field of interpretability in deep-learning-based multimodal research and thereby notably advance the integration of multimodal AI into clinical practice. Our code is publicly available at this https URL.
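To illustrate the general idea of an occlusion-based modality contribution score as described in the Methods, the following is a minimal sketch in Python/PyTorch. It occludes one modality at a time with a constant value and measures the resulting shift in model output, then normalizes the shifts into per-modality scores. All names here (ToyFusionNet, modality_contribution, the modality keys) are hypothetical illustrations under generic assumptions, not the authors' implementation or exact metric.

import torch
import torch.nn as nn

class ToyFusionNet(nn.Module):
    """Tiny late-fusion classifier over two hypothetical modalities (assumption)."""
    def __init__(self, dim_img=32, dim_tab=8, n_classes=3):
        super().__init__()
        self.img_branch = nn.Sequential(nn.Linear(dim_img, 16), nn.ReLU())
        self.tab_branch = nn.Sequential(nn.Linear(dim_tab, 16), nn.ReLU())
        self.head = nn.Linear(32, n_classes)

    def forward(self, batch):
        h = torch.cat([self.img_branch(batch["image"]),
                       self.tab_branch(batch["tabular"])], dim=-1)
        return self.head(h)

@torch.no_grad()
def modality_contribution(model, batch, occlusion_value=0.0):
    """Occlude one modality at a time and measure the shift in model output.

    A modality's contribution is the mean absolute change in predicted
    probabilities when it is replaced by a constant, normalized over all
    modalities. This is a generic occlusion scheme, not necessarily the
    paper's exact formulation.
    """
    model.eval()
    baseline = torch.softmax(model(batch), dim=-1)
    shifts = {}
    for name in batch:
        occluded = dict(batch)  # shallow copy; replace only one modality
        occluded[name] = torch.full_like(batch[name], occlusion_value)
        probs = torch.softmax(model(occluded), dim=-1)
        shifts[name] = (probs - baseline).abs().sum(dim=-1).mean().item()
    total = sum(shifts.values()) or 1.0  # guard against all-zero shifts
    return {name: s / total for name, s in shifts.items()}

if __name__ == "__main__":
    net = ToyFusionNet()
    batch = {"image": torch.randn(4, 32), "tabular": torch.randn(4, 8)}
    print(modality_contribution(net, batch))  # e.g. {'image': 0.7, 'tabular': 0.3}

Because the score is computed purely from inputs and outputs, such a scheme stays model- and performance-agnostic in the sense used in the abstract: it requires no access to internal layers or labels.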

@article{gapp2025_2503.01904,
  title={ What are You Looking at? Modality Contribution in Multimodal Medical Deep Learning Methods },
  author={ Christian Gapp and Elias Tappeiner and Martin Welk and Karl Fritscher and Elke Ruth Gizewski and Rainer Schubert },
  journal={arXiv preprint arXiv:2503.01904},
  year={ 2025 }
}