CGMDRNet: Cross-Guided Modality Difference Reduction Network for RGB-T Salient Object Detection

2022
How to exploit the interaction between the RGB and thermal modalities is key to the success of RGB-T salient object detection (SOD). Most existing methods integrate multi-modality information by designing various fusion strategies. However, the modality gap between RGB and thermal features leads to unsatisfactory performance when the features are fused by simple concatenation. To solve this problem, we propose a cross-guided modality difference reduction network (CGMDRNet), which achieves intrinsically consistent feature fusion by reducing modality differences. Specifically, we design a modality difference reduction (MDR) module, embedded in each layer of the backbone network, that uses a cross-guided strategy to reduce the modality difference between RGB and thermal features. A cross-attention fusion (CAF) module is then designed to fuse the cross-modality features, whose modality differences are now small. In addition, we use a transformer-based feature enhancement (TFE) module to enhance the high-level feature representations, which contribute most to performance. Finally, the high-level features guide the fusion of low-level features to obtain a saliency map with clear boundaries. Extensive experiments on three public RGB-T datasets show that the proposed CGMDRNet achieves competitive performance compared with state-of-the-art (SOTA) RGB-T SOD models.
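The abstract does not give implementation details, but the cross-guided fusion idea can be illustrated with a minimal sketch: each modality's features are reweighted by a channel-attention gate computed from the *other* modality before the two streams are combined. The module name, squeeze-and-excitation-style gating, and all hyperparameters below are assumptions for illustration, not the paper's actual CAF design.

```python
# Hypothetical sketch of cross-guided attention fusion (assumed design,
# not the paper's exact CAF module). Each modality's feature map is
# modulated by channel attention derived from the other modality.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        # Channel-attention branch (squeeze-and-excitation style).
        def ca_branch():
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1),
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )
        self.rgb_gate = ca_branch()      # attention computed from RGB
        self.thermal_gate = ca_branch()  # attention computed from thermal

    def forward(self, f_rgb, f_t):
        # Cross guidance: RGB attention modulates the thermal stream,
        # thermal attention modulates the RGB stream, then the two
        # guided streams are fused by elementwise addition.
        t_guided = f_t * self.rgb_gate(f_rgb)
        rgb_guided = f_rgb * self.thermal_gate(f_t)
        return rgb_guided + t_guided

# Usage: fuse same-resolution RGB and thermal feature maps.
caf = CrossAttentionFusion(channels=64)
fused = caf(torch.randn(2, 64, 56, 56), torch.randn(2, 64, 56, 56))
```

In this sketch the cross gating plays the role the abstract assigns to the cross-guided strategy: each modality is corrected using information from the other before fusion, rather than concatenating raw features with a large modality gap.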