It takes two to borrow: a donor and a recipient. Who’s who?

Liviu Dinu, Ana Uban, Anca Dinu, Ioan-Bogdan Iordache, Simona Georgescu, Laurentiu Zoicas


Abstract
We address the open problem of automatically identifying the direction of lexical borrowing, given word pairs in the donor and recipient languages. We propose strong benchmarks for this task by applying a set of machine learning models. We extract and publicly release a comprehensive borrowings dataset from the recent RoBoCoP cognates and borrowings database for five Romance languages. We experiment on this dataset with both graphic and phonetic representations and with different features, models and architectures. We interpret the results in terms of F1 score, commenting on the influence of features and model choice, of the imbalanced data, and of the inherent difficulty of the task for particular language pairs. We show that automatically determining the direction of borrowing is a feasible task, and propose additional directions for future work.
Anthology ID:
2024.findings-acl.360
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
6023–6035
URL:
https://aclanthology.org/2024.findings-acl.360
Cite (ACL):
Liviu Dinu, Ana Uban, Anca Dinu, Ioan-Bogdan Iordache, Simona Georgescu, and Laurentiu Zoicas. 2024. It takes two to borrow: a donor and a recipient. Who’s who?. In Findings of the Association for Computational Linguistics ACL 2024, pages 6023–6035, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
It takes two to borrow: a donor and a recipient. Who’s who? (Dinu et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-acl.360.pdf