Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection

Abstract

The rapid development of deep learning has significantly improved salient object detection (SOD) that combines RGB and thermal (RGB-T) images. However, existing deep learning-based RGB-T SOD models suffer from two major limitations. First, Transformer-based models with quadratic complexity are computationally expensive and memory-intensive, limiting their application in high-resolution bi-modal feature fusion. Second, even when these models converge to an optimal solution, there remains a frequency gap between the prediction and the ground truth. To overcome these limitations, we propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet), for accurate RGB-T SOD. To address the computational complexity when dealing with high-resolution images, we leverage the efficiency of the fast Fourier transform with linear complexity to design three key components: (1) the Modal-coordinated Perception Attention, which fuses RGB and thermal modalities with enhanced multi-dimensional representation; (2) the Frequency-decomposed Edge-aware Block, which clarifies object edges by deeply decomposing and enhancing frequency components of low-level features; and (3) the Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. To mitigate the frequency gap, we propose Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bi-modal edge information in the Fourier domain. Extensive experiments on four RGB-T SOD benchmark datasets demonstrate that DFENet outperforms fifteen existing state-of-the-art RGB-T SOD models. Comprehensive ablation studies further validate the value and effectiveness of our newly proposed components. The code is available at this https URL.
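The idea of weighting "hard" frequencies can be illustrated with a small, generic sketch. The code below is not the Co-focus Frequency Loss from the paper: it omits the bi-modal edge cross-referencing entirely, and the function name `frequency_weighted_loss` and the parameter `alpha` are hypothetical. It only shows, under those assumptions, how a spectral error computed with a 2-D FFT can be re-weighted so that poorly matched frequencies contribute more.

```python
import numpy as np


def frequency_weighted_loss(pred, target, alpha=1.0):
    """Spectral squared error, re-weighted toward poorly matched frequencies.

    Hypothetical illustration only; not the loss defined in the paper.
    pred and target are 2-D arrays of the same shape.
    """
    # 2-D FFT of each map; norm="ortho" makes the transform unitary,
    # so spectral energy equals spatial energy (Parseval).
    pred_f = np.fft.fft2(pred, norm="ortho")
    target_f = np.fft.fft2(target, norm="ortho")

    # Squared distance between the complex spectra at every frequency.
    dist = np.abs(pred_f - target_f) ** 2

    # Weight each frequency by its own error magnitude raised to alpha,
    # scaled into [0, 1]. In a training framework this weight would
    # usually be treated as a constant (no gradient flows through it).
    weight = np.sqrt(dist) ** alpha
    peak = weight.max()
    if peak > 0:
        weight = weight / peak

    return float(np.mean(weight * dist))
```

With `alpha=0` every weight is 1, and the value reduces to the ordinary mean squared error between the two maps; larger `alpha` concentrates the loss on the frequencies with the largest mismatch.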

@article{lyu2025_2411.18409,
  title={Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection},
  author={Pengfei Lyu and Pak-Hei Yeung and Xiaosheng Yu and Chengdong Wu and Jagath C. Rajapakse},
  journal={arXiv preprint arXiv:2411.18409},
  year={2025}
}