Multilingual Neural Machine Translation With Soft Decoupled Encoding

2019
Multilingual training of neural machine translation (NMT) systems has led to impressive accuracy improvements on low-resource languages. However, there are still significant challenges in efficiently learning word representations in the face of paucity of data. In this paper, we propose Soft Decoupled Encoding (SDE), a multilingual lexicon encoding framework specifically designed to share lexical-level information intelligently without requiring heuristic preprocessing such as pre-segmenting the data. SDE represents a word by its spelling through a character encoding, and its semantic meaning through a latent embedding space shared by all languages. Experiments on a standard dataset of four low-resource languages show consistent improvements over strong multilingual NMT baselines, with gains of up to 2 BLEU on one of the tested languages, achieving the new state-of-the-art on all four language pairs.
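
The two-part representation described above (a spelling-based character encoding plus a semantic lookup in a latent space shared by all languages) can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the use of character n-grams, the softmax attention over the shared latent matrix, the residual sum, the random untrained weights, and all names (`char_ngrams`, `SoftDecoupledEncoder`, `dim`, `n_latent`).

```python
import math
import random


def char_ngrams(word, n_max=3):
    """Return all character n-grams of length 1..n_max (assumed spelling features)."""
    return [word[i:i + n] for n in range(1, n_max + 1)
            for i in range(len(word) - n + 1)]


class SoftDecoupledEncoder:
    """Toy word encoder: spelling vector + attention over a shared latent space."""

    def __init__(self, dim=8, n_latent=16, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.ngram_emb = {}  # n-gram vectors, created lazily (untrained)
        # Latent semantic embeddings, assumed to be shared by all languages.
        self.latent = [[self.rng.gauss(0, 0.1) for _ in range(dim)]
                       for _ in range(n_latent)]

    def _ngram_vec(self, gram):
        if gram not in self.ngram_emb:
            self.ngram_emb[gram] = [self.rng.gauss(0, 0.1) for _ in range(self.dim)]
        return self.ngram_emb[gram]

    def spelling(self, word):
        """Character encoding: squashed sum of the word's n-gram vectors."""
        vec = [0.0] * self.dim
        for gram in char_ngrams(word):
            for k, x in enumerate(self._ngram_vec(gram)):
                vec[k] += x
        return [math.tanh(x) for x in vec]

    def encode(self, word):
        """Attend from the spelling vector over the shared latent embeddings."""
        c = self.spelling(word)
        scores = [sum(a * b for a, b in zip(c, row)) for row in self.latent]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        attn = [e / total for e in exps]
        semantic = [sum(attn[j] * self.latent[j][k] for j in range(len(self.latent)))
                    for k in range(self.dim)]
        # Residual sum keeps the spelling information alongside the semantics.
        return [s + x for s, x in zip(semantic, c)]


encoder = SoftDecoupledEncoder()
vector = encoder.encode("kitap")  # any word, no pre-segmentation needed
```

Because the encoder works directly on the characters of each word, no subword segmentation step is needed beforehand, which is the property the abstract emphasizes. In a real system the n-gram and latent embeddings would be trained jointly with the translation model rather than sampled at random.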