A Tibetan Language Model That Considers the Relationship Between Suffixes and Functional Words

2021
The complete semantic representation of a Tibetan sentence is largely determined by the functional words added to it, and the choice of functional word is in turn influenced, both explicitly and implicitly, by the sequence of Tibetan suffixes. In this article, we propose an RNN-based Tibetan radical suffix unit (TRSU) that models this relationship. In the explicit variant, Tibetan radical suffix unit-explicit (TRSU-E), the fixed suffix in Tibetan is used to determine the virtual functional words. In the implicit variant, Tibetan radical suffix unit-implicit (TRSU-I), the decision is assisted by adding a specific suffix. To test the method, we build a standard Tibetan corpus that covers different genres. Our experimental results show that our method reduces perplexity (PPL) by up to 22.2% relative to the best baseline. Furthermore, by using hidden semantic information and the implicit suffix, TRSU-I outperforms TRSU-E, reducing PPL by 3%. The method also achieves good results on the English Penn Treebank data set.
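For readers unfamiliar with the metric, the following is a minimal sketch of how perplexity and a relative reduction are usually computed. It is not the authors' evaluation code: the function names and all numbers are hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the model's probability for each token.

    Defined as exp of the average negative log-probability per token.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


def relative_reduction(baseline_ppl, model_ppl):
    """Fraction by which model_ppl is lower than baseline_ppl."""
    return (baseline_ppl - model_ppl) / baseline_ppl


# Hypothetical example: a model that assigns probability 0.25 to every token
# is as uncertain as a uniform choice among four options.
print(perplexity([0.25] * 8))

# Hypothetical baseline and model perplexities (not taken from the article).
print(relative_reduction(100.0, 77.8))
```

Lower perplexity means the model assigns higher probability to the held-out text, so a relative reduction compares two models evaluated on the same test set.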