On the Intelligibility of Romance Language Varieties: Spanish and Portuguese in Europe and America

Liviu P. Dinu, Ana Sabina Uban, Teodor-George Marchitan, Ioan-Bogdan Iordache, Simona Georgescu


Abstract
Mutual intelligibility within language families presents a significant challenge for multilingual NLP, particularly because of widespread dialectal variation and asymmetric comprehension. In this paper, we present a corpus-based computational analysis that quantifies linguistic proximity across Romance language varieties, focusing on major Spanish (Argentine, Chilean and European) and Portuguese (Brazilian and European) varieties and on the other main Romance languages (Italian, French, Romanian). We apply a computational metric of lexical intelligibility, based on the surface and semantic similarity of related words, to measure mutual intelligibility for the five main Romance languages in relation to the Spanish and Portuguese varieties studied.
Anthology ID:
2026.vardial-1.11
Volume:
Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
March
Year:
2026
Address:
Rabat, Morocco
Venues:
VarDial | WS
Publisher:
Association for Computational Linguistics
Pages:
139–144
URL:
https://aclanthology.org/2026.vardial-1.11/
Cite (ACL):
Liviu P. Dinu, Ana Sabina Uban, Teodor-George Marchitan, Ioan-Bogdan Iordache, and Simona Georgescu. 2026. On the Intelligibility of Romance Language Varieties: Spanish and Portuguese in Europe and America. In Proceedings of the 13th Workshop on NLP for Similar Languages, Varieties and Dialects, pages 139–144, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
On the Intelligibility of Romance Language Varieties: Spanish and Portuguese in Europe and America (Dinu et al., VarDial 2026)
PDF:
https://aclanthology.org/2026.vardial-1.11.pdf