MultiTaskVIF: Segmentation-oriented visible and infrared image fusion via multi-task learning

Abstract

Visible and infrared image fusion (VIF) has attracted significant attention in recent years. Traditional VIF methods primarily focus on generating fused images with high visual quality, while recent advancements increasingly emphasize incorporating semantic information into the fusion model during training. However, most existing segmentation-oriented VIF methods adopt a cascade structure comprising separate fusion and segmentation models, leading to increased network complexity and redundancy. This raises a critical question: can we design a more concise and efficient structure to integrate semantic information directly into the fusion model during training? Inspired by multi-task learning, we propose a concise and universal training framework, MultiTaskVIF, for segmentation-oriented VIF models. In this framework, we introduce a multi-task head decoder (MTH) to simultaneously output both the fused image and the segmentation result during training. Unlike previous cascade training frameworks that necessitate joint training with a complete segmentation model, MultiTaskVIF enables the fusion model to learn semantic features by simply replacing its decoder with MTH. Extensive experimental evaluations validate the effectiveness of the proposed method. Our code will be released upon acceptance.
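The core idea in the abstract — a shared encoder whose decoder is replaced by a multi-task head producing both a fused image and segmentation logits, trained with a combined loss — can be sketched as follows. This is a minimal NumPy illustration under assumed toy shapes; the function names (`encoder`, `multi_task_head`), the 1x1 linear "convolutions", and the loss weight `lam` are all illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy shapes: an 8x8 visible/infrared pair, 4 feature channels,
# 3 segmentation classes. All values here are illustrative.
H, W, C_FEAT, N_CLASSES = 8, 8, 4, 3

def encoder(vis, ir):
    """Shared-encoder sketch: stack modalities, mix with 1x1 'conv' weights."""
    x = np.stack([vis, ir], axis=0)                     # (2, H, W)
    w = rng.standard_normal((C_FEAT, 2)) * 0.1
    return np.einsum("cm,mhw->chw", w, x)               # (C_FEAT, H, W)

def multi_task_head(feat):
    """MTH sketch: one shared feature map, two task-specific outputs."""
    w_fuse = rng.standard_normal((1, C_FEAT)) * 0.1
    w_seg = rng.standard_normal((N_CLASSES, C_FEAT)) * 0.1
    fused = np.einsum("oc,chw->ohw", w_fuse, feat)[0]   # (H, W) fused image
    seg_logits = np.einsum("oc,chw->ohw", w_seg, feat)  # (N_CLASSES, H, W)
    return fused, seg_logits

# Forward pass on random inputs standing in for a registered image pair.
vis, ir = rng.random((H, W)), rng.random((H, W))
fused, seg_logits = multi_task_head(encoder(vis, ir))

# Multi-task training loss sketch: a fusion reconstruction term plus a
# weighted softmax cross-entropy segmentation term (weight `lam` assumed).
lam = 0.5
fusion_loss = np.mean((fused - 0.5 * (vis + ir)) ** 2)
probs = np.exp(seg_logits) / np.exp(seg_logits).sum(axis=0, keepdims=True)
labels = rng.integers(0, N_CLASSES, size=(H, W))        # dummy ground truth
seg_loss = -np.mean(
    np.log(probs[labels, np.arange(H)[:, None], np.arange(W)])
)
total_loss = fusion_loss + lam * seg_loss
```

The point of the sketch is structural: both heads read the same shared feature map, so gradients from the segmentation term shape the fusion features directly, with no separate downstream segmentation network in the loop. At inference only the fusion output would be kept.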

@article{zhao2025_2505.06665,
  title={MultiTaskVIF: Segmentation-oriented visible and infrared image fusion via multi-task learning},
  author={Zixian Zhao and Andrew Howes and Xingchen Zhang},
  journal={arXiv preprint arXiv:2505.06665},
  year={2025}
}