Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages

2022
Though there has recently been increased interest in how pre-trained language models encode different linguistic features, systematic comparisons between languages with different morphology and syntax are still lacking. In this paper, using BERT as an example of a pre-trained model, we compare how three typologically different languages (English, Korean, and Russian) encode morphological and syntactic features across layers. In particular, we contrast languages that differ in a specific aspect, such as word-order flexibility, head directionality, morphological type, presence of grammatical gender, and morphological richness, across four tasks.
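
The abstract does not spell out how per-layer representations are obtained or probed, so the following is only a minimal sketch of one common layer-wise setup: extracting hidden states from a multilingual BERT checkpoint (assumed here to be bert-base-multilingual-cased via the HuggingFace transformers library) and pooling them into per-layer sentence vectors that a simple probe classifier could later consume. The example sentences, the checkpoint choice, and mean-pooling are illustrative assumptions, not the paper's actual procedure.

    import torch
    from transformers import AutoTokenizer, AutoModel

    # Assumed checkpoint; the paper may use monolingual models per language instead.
    model_name = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    # Illustrative sentences for the three languages studied.
    sentences = {
        "en": "The cat chased the mouse.",
        "ko": "고양이가 쥐를 쫓았다.",
        "ru": "Кошка гналась за мышью.",
    }

    with torch.no_grad():
        for lang, text in sentences.items():
            enc = tokenizer(text, return_tensors="pt")
            out = model(**enc)
            # hidden_states: tuple of (embedding layer + 12 transformer layers),
            # each of shape [1, seq_len, hidden_size].
            hidden_states = out.hidden_states
            # Mean-pool tokens in each layer to get one vector per layer,
            # which could serve as input to a per-layer probing classifier.
            layer_vectors = [h.mean(dim=1).squeeze(0) for h in hidden_states]
            print(lang, len(layer_vectors), tuple(layer_vectors[0].shape))

A probing study would then typically train a lightweight classifier (e.g. logistic regression) on each layer's vectors for each task and compare accuracy curves across layers and languages; the specific tasks and probes used in the paper are not given in this abstract.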