Emanuel Pariasca
2021
Neural Borrowing Detection with Monolingual Lexical Models
John Miller
|
Emanuel Pariasca
|
Cesar Beltran Castañon
Proceedings of the Student Research Workshop Associated with RANLP 2021
Identification of lexical borrowings, transfer of words between languages, is an essential practice of historical linguistics and a vital tool in analysis of language contact and cultural events in general. We seek to improve tools for automatic detection of lexical borrowings, focusing here on detecting borrowed words from monolingual wordlists. Starting with a recurrent neural lexical language model and competing entropies approach, we incorporate a more current Transformer based lexical model. From there we experiment with several different models and approaches including a lexical donor model with augmented wordlist. The Transformer model reduces execution time and minimally improves borrowing detection. The augmented donor model shows some promise. A substantive change in approach or model is needed to make significant gains in identification of lexical borrowings.
Search