
Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection


Source: arXiv
Abstract

The rapid development of deep learning has significantly improved salient object detection (SOD) combining both RGB and thermal (RGB-T) images. However, existing deep learning-based RGB-T SOD models suffer from two major limitations. First, Transformer-based models with quadratic complexity are computationally expensive and memory-intensive, limiting their application in high-resolution bi-modal feature fusion. Second, even when these models converge to an optimal solution, there remains a frequency gap between the prediction and ground-truth. To overcome these limitations, we propose a purely Fourier transform-based model, namely Deep Fourier-Embedded Network (DFENet), for accurate RGB-T SOD. To address the computational complexity when dealing with high-resolution images, we leverage the efficiency of fast Fourier transform with linear complexity to design three key components: (1) the Modal-coordinated Perception Attention, which fuses RGB and thermal modalities with enhanced multi-dimensional representation; (2) the Frequency-decomposed Edge-aware Block, which clarifies object edges by deeply decomposing and enhancing frequency components of low-level features; and (3) the Fourier Residual Channel Attention Block, which prioritizes high-frequency information while aligning channel-wise global relationships. To mitigate the frequency gap, we propose Co-focus Frequency Loss, which dynamically weights hard frequencies during edge frequency reconstruction by cross-referencing bi-modal edge information in the Fourier domain. Extensive experiments on four RGB-T SOD benchmark datasets demonstrate that DFENet outperforms fifteen existing state-of-the-art RGB-T SOD models. Comprehensive ablation studies further validate the value and effectiveness of our newly proposed components. The code is available at https://github.com/JoshuaLPF/DFENet.
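The idea of weighting "hard" frequencies can be illustrated with a small NumPy sketch. This is not the paper's Co-focus Frequency Loss: it omits the bi-modal edge cross-referencing entirely, and the function name `frequency_weighted_loss` and the parameter `alpha` are assumptions made only for illustration. It shows one generic way to measure a frequency gap between a prediction and a ground-truth map and to emphasize the frequencies where the error is largest.

```python
import numpy as np


def frequency_weighted_loss(pred, target, alpha=1.0):
    """Illustrative frequency-domain loss (hypothetical, not the paper's loss).

    pred, target: 2-D arrays of the same shape, e.g. saliency maps in [0, 1].
    alpha: assumed exponent controlling how strongly large errors are emphasized.
    """
    # Orthonormal 2-D FFT, so total energy is preserved (Parseval's theorem).
    pred_f = np.fft.fft2(pred, norm="ortho")
    target_f = np.fft.fft2(target, norm="ortho")

    # Squared distance between the two spectra at each frequency.
    dist = np.abs(pred_f - target_f) ** 2

    # Weight each frequency by its own error magnitude, scaled to [0, 1].
    # Frequencies that are reconstructed poorly receive larger weights.
    weight = np.sqrt(dist) ** alpha
    max_w = weight.max()
    if max_w > 0:
        weight = weight / max_w

    return float(np.mean(weight * dist))


# Toy example: a square "object" and an under-confident prediction of it.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0
pred = 0.5 * target

loss_identical = frequency_weighted_loss(target, target)
loss_weighted = frequency_weighted_loss(pred, target)
loss_unweighted = frequency_weighted_loss(pred, target, alpha=0.0)
```

With `alpha=0.0` every weight equals 1, and because the transform is orthonormal the loss reduces to the ordinary spatial mean squared error. Larger values of `alpha` concentrate the loss on the worst-reconstructed frequencies. In a trainable setting the weights would typically be treated as constants during backpropagation, which is a design choice assumed here rather than taken from the abstract.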

Pengfei Lyu, Jagath C. Rajapakse, Xiaosheng Yu, Chengdong Wu, Pak-Hei Yeung

Subject: Computing technology, computer technology

Pengfei Lyu, Jagath C. Rajapakse, Xiaosheng Yu, Chengdong Wu, Pak-Hei Yeung. Deep Fourier-embedded Network for RGB and Thermal Salient Object Detection [EB/OL]. (2024-11-27) [2025-06-23]. https://arxiv.org/abs/2411.18409.
