Visible and thermal images fusion architecture for few-shot semantic segmentation

2021
Abstract Few-shot semantic segmentation (FSS) has drawn great attention in the computer vision community due to its remarkable potential for segmenting novel objects with only a few pixel-annotated samples. However, interference factors such as insufficient illumination and complex backgrounds impose greater challenges on segmentation performance than in the fully-supervised setting when the number of samples is limited. We therefore propose the visible and thermal (V-T) few-shot semantic segmentation task, which exploits the complementary and shared information of visible and thermal images to boost few-shot segmentation performance. As a first step, we build a novel outdoor city dataset, Tokyo Multi-Spectral-4i, for the V-T few-shot semantic segmentation task. In addition, we propose a fusion architecture consisting of an Edge Similarity fusion (ES) module and a Texture Edge Prototype (TEP) module. The ES module fuses the bi-modal information by exploiting the edge similarity between the visible and thermal images. The TEP module extracts the prototype from the two modalities by combining the representativeness and complementarity of the visible and thermal features. Finally, extensive experiments on the proposed dataset demonstrate that our architecture achieves state-of-the-art results.
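The abstract describes a prototype-based bi-modal design but does not include implementation details, so the following is a minimal illustrative sketch, not the authors' method: it shows how a fused prototype could be formed from visible and thermal support features via masked average pooling and a learnable fusion weight. The module name `BiModalPrototype`, the tensor shapes, and the weighted-sum fusion are all assumptions for illustration.

```python
# Hypothetical sketch of bi-modal prototype extraction for few-shot segmentation.
# Shapes, module names, and the weighted-sum fusion are assumptions, not the
# paper's TEP module.
import torch
import torch.nn as nn
import torch.nn.functional as F


def masked_average_pooling(feat, mask):
    """Average support features over the foreground mask to get a prototype.

    feat: (B, C, H, W) support features; mask: (B, 1, H, W) binary foreground mask.
    """
    mask = F.interpolate(mask, size=feat.shape[-2:], mode="bilinear", align_corners=False)
    proto = (feat * mask).sum(dim=(2, 3)) / (mask.sum(dim=(2, 3)) + 1e-6)
    return proto  # (B, C)


class BiModalPrototype(nn.Module):
    """Extract one prototype per modality and fuse them with a learnable weight."""

    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.5))  # fusion weight (assumption)

    def forward(self, feat_vis, feat_thr, support_mask):
        proto_vis = masked_average_pooling(feat_vis, support_mask)
        proto_thr = masked_average_pooling(feat_thr, support_mask)
        a = torch.sigmoid(self.alpha)
        return a * proto_vis + (1.0 - a) * proto_thr  # fused prototype (B, C)


if __name__ == "__main__":
    B, C, H, W = 2, 256, 32, 32
    module = BiModalPrototype()
    fused = module(torch.randn(B, C, H, W), torch.randn(B, C, H, W),
                   torch.randint(0, 2, (B, 1, H, W)).float())
    print(fused.shape)  # torch.Size([2, 256])
```

In a typical prototype-based FSS pipeline, such a fused prototype would then be compared (e.g., by cosine similarity) against query features from both modalities to produce the segmentation mask; the exact matching strategy used in the paper is not specified in the abstract.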