Cross-Lingual Text Image Recognition via Multi-Task Sequence to Sequence Learning
2021
This paper considers recognizing text rendered in a source-language image and translating it directly into a target language, without generating intermediate source-language recognition results. We call this problem Cross-Lingual Text Image Recognition (CLTIR). To solve it, we propose a multi-task system that simultaneously trains a main task of CLTIR and an auxiliary task of Mono-Lingual Text Image Recognition (MLTIR). Two different sequence-to-sequence learning methods are adopted for these tasks respectively: a convolution-based attention model and a Bidirectional Long Short-Term Memory (BLSTM) model with Connectionist Temporal Classification (CTC). We evaluate the system on a newly collected Chinese-English bilingual movie subtitle image dataset. Experimental results demonstrate that the multi-task learning framework achieves superior performance in both languages.
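The multi-task setup described above can be sketched as a single objective that combines the main cross-lingual loss with a down-weighted auxiliary recognition loss. The function name, the additive weighting scheme, and the weight value below are illustrative assumptions for exposition, not the paper's exact formulation.

```python
# Hypothetical sketch of a multi-task training objective: a shared image
# encoder would feed two decoders (an attention-based CLTIR decoder and a
# BLSTM-CTC MLTIR decoder), and their per-batch losses are combined.
# The weighting scheme and default weight are assumptions, not taken
# from the paper.

def combined_loss(cltir_loss: float, mltir_loss: float,
                  aux_weight: float = 0.3) -> float:
    """Main cross-lingual loss plus a down-weighted auxiliary CTC loss."""
    return cltir_loss + aux_weight * mltir_loss

# Example: attention-decoder loss 2.0, auxiliary CTC loss 1.5
total = combined_loss(2.0, 1.5)  # -> 2.45
```

In practice the auxiliary weight would be a tuned hyperparameter; the auxiliary MLTIR branch regularizes the shared encoder so that it learns robust source-language visual features even while the main decoder emits target-language text.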