Cross-Lingual Text Image Recognition via Multi-Task Sequence to Sequence Learning

2021 
This paper considers recognizing text rendered in a source language and translating it directly into a target language, without generating intermediate source-language recognition results. We call this problem Cross-Lingual Text Image Recognition (CLTIR). To solve it, we propose a multi-task system that simultaneously trains a main CLTIR task and an auxiliary Mono-Lingual Text Image Recognition (MLTIR) task. Two different sequence-to-sequence learning methods are adopted for these tasks: a convolution-based attention model and a Bidirectional Long Short-Term Memory (BLSTM) model with Connectionist Temporal Classification (CTC), respectively. We evaluate the system on a newly collected Chinese-English bilingual movie subtitle image dataset. Experimental results demonstrate that the multi-task learning framework achieves superior performance in both languages.
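
To make the described architecture concrete, the sketch below shows one plausible PyTorch realization of the multi-task setup: a shared convolutional encoder feeds both an attention-based decoder for the cross-lingual (CLTIR) main task and a BLSTM-CTC head for the mono-lingual (MLTIR) auxiliary task, trained with a weighted joint loss. All module names, layer sizes, the recurrent attention decoder (used here as a stand-in for the paper's convolution-based attention model), and the task-weighting hyperparameter lam are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedEncoder(nn.Module):
    """Convolutional feature extractor shared by both tasks (assumed design)."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, feat_dim, 3, stride=(2, 1), padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),  # collapse the height dimension
        )

    def forward(self, images):                   # images: (B, 3, H, W)
        f = self.cnn(images)                     # (B, C, 1, W')
        return f.squeeze(2).permute(0, 2, 1)     # (B, W', C) feature sequence


class AttentionDecoder(nn.Module):
    """Recurrent attention decoder (stand-in for the convolution-based attention
    model); emits target-language tokens for the CLTIR main task."""

    def __init__(self, feat_dim, vocab_size, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRUCell(hidden + feat_dim, hidden)
        self.attn = nn.Linear(hidden + feat_dim, 1)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, feats, targets):           # feats: (B, T, C), targets: (B, L)
        B, T, _ = feats.shape
        h = feats.new_zeros(B, self.rnn.hidden_size)
        logits = []
        for t in range(targets.size(1)):         # teacher forcing; SOS/EOS handling omitted
            scores = self.attn(torch.cat([h.unsqueeze(1).expand(-1, T, -1), feats], dim=-1))
            alpha = torch.softmax(scores, dim=1)  # attention weights over encoder steps
            context = (alpha * feats).sum(dim=1)  # (B, C)
            h = self.rnn(torch.cat([self.embed(targets[:, t]), context], dim=-1), h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)         # (B, L, vocab)


class CTCBranch(nn.Module):
    """BLSTM + CTC head for the auxiliary mono-lingual (MLTIR) recognition task."""

    def __init__(self, feat_dim, vocab_size, hidden=256):
        super().__init__()
        self.blstm = nn.LSTM(feat_dim, hidden, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * hidden, vocab_size + 1)  # +1 for CTC blank (last index)

    def forward(self, feats):
        h, _ = self.blstm(feats)
        return self.proj(h).log_softmax(-1)       # (B, T, vocab+1)


def multitask_loss(images, tgt_cross, tgt_mono, tgt_mono_lens,
                   enc, dec, ctc_head, lam=0.5):
    """Joint objective: weighted sum of the cross-lingual cross-entropy loss and
    the mono-lingual CTC loss; lam is an assumed weighting hyperparameter."""
    feats = enc(images)
    ce = F.cross_entropy(dec(feats, tgt_cross).transpose(1, 2), tgt_cross)
    log_probs = ctc_head(feats).permute(1, 0, 2)  # (T, B, V+1) as CTCLoss expects
    in_lens = torch.full((images.size(0),), log_probs.size(0), dtype=torch.long)
    ctc = F.ctc_loss(log_probs, tgt_mono, in_lens, tgt_mono_lens,
                     blank=log_probs.size(-1) - 1)
    return lam * ce + (1 - lam) * ctc
```

Under this assumed design, only the attention branch would be needed at inference time; the CTC branch serves purely as an auxiliary training signal that regularizes the shared encoder.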