Combining Cross Entropy Loss with Manually Defined Hard Example for Semantic Image Segmentation.

2019
Semantic image segmentation is one of the fundamental tasks in computer vision, aiming to assign a label to each pixel in an image. Approaches based on the fully convolutional network (FCN) currently show state-of-the-art performance on this task. However, most of them adopt cross entropy as the loss function, which leads to poor performance in regions near object boundaries. In this paper, we introduce two region-based metrics to quantitatively evaluate the quality of segmentation detail, providing insight into the model's bottleneck. Based on this analysis, and using a modified multi-task learning scheme, we combine cross entropy loss with manually defined hard examples to propose a simple yet effective loss function named \(\mathcal {L}_\mathrm{{cehe}}\), which helps the model focus on learning segmentation detail. Experiments show that a model trained with \(\mathcal {L}_\mathrm{{cehe}}\) utilizes spatial information better than one trained with the conventional cross entropy loss \(\mathcal {L}_\mathrm{{ce}}\). Statistically, the proposed method outperforms the widely used \(\mathcal {L}_\mathrm{{ce}}\) by \(1.12\%\) in terms of MIoU on the Cityscapes validation set, and by \(4.15\%\) in terms of the region-based metric MIoUiER proposed in this paper, showing that \(\mathcal {L}_\mathrm{{cehe}}\) performs better on segmentation detail.
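The abstract does not spell out how the hard examples are defined or how the two loss terms are weighted. The following is a minimal numpy sketch of one plausible reading: per-pixel cross entropy over the whole image, plus an extra cross entropy term restricted to a band of pixels near label boundaries (a hypothetical stand-in for the "manually defined hard examples"). The boundary-band definition, the weight `lam`, and the band `width` are all assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(logits):
    # logits: (C, H, W) class scores; stabilized softmax over the class axis.
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def boundary_mask(labels, width=1):
    # Mark pixels whose 4-neighbourhood contains a different label.
    # This band is a *hypothetical* hard-example region near object
    # boundaries; the paper's manual definition may differ.
    m = np.zeros_like(labels, dtype=bool)
    m[:-1, :] |= labels[:-1, :] != labels[1:, :]
    m[1:, :]  |= labels[1:, :] != labels[:-1, :]
    m[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    m[:, 1:]  |= labels[:, 1:] != labels[:, :-1]
    for _ in range(width - 1):      # widen the band by simple dilation
        d = m.copy()
        d[:-1, :] |= m[1:, :]; d[1:, :] |= m[:-1, :]
        d[:, :-1] |= m[:, 1:]; d[:, 1:] |= m[:, :-1]
        m = d
    return m

def cehe_loss(logits, labels, lam=1.0):
    # Assumed form: L_cehe = L_ce(all pixels) + lam * L_ce(hard-example band).
    p = softmax(logits)                                   # (C, H, W)
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    ce_map = -np.log(p[labels, rows, cols] + 1e-12)       # per-pixel CE
    mask = boundary_mask(labels)
    ce = ce_map.mean()
    he = ce_map[mask].mean() if mask.any() else 0.0
    return ce + lam * he
```

With uniform logits, every pixel's cross entropy is \(\log C\), so on a binary label map the loss evaluates to \(\log 2 + \lambda \log 2\); the second term only starts to dominate once the model fits the interior pixels but still errs near boundaries, which matches the motivation stated above.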