Aligning Alignments: Do Colexification and Distributional Similarity Align as Measures of cross-lingual Lexical Alignment?

Taelin Karidi, Eitan Grossman, Omri Abend


Abstract
The data-driven investigation of the extent to which lexicons of different languages align has mostly fallen into one of two categories:colexification-based and distributional. The two approaches are grounded in distinct methodologies, operate on different assumptions, and are used in diverse ways.This raises two important questions: (a) are there settings in which the predictions of the two approaches can be directly compared? and if so, (b) what is the extent of the similarity and what are its determinants? We offer novel operationalizations for the two approaches in a manner that allows for their direct comparison, and conduct a comprehensive analysis on a diverse set of 16 languages.Our analysis is carried out at different levels of granularity. At the word-level, the two methods present different results across the board. However, intriguingly, at the level of semantic domains (e.g., kinship, quantity), the two methods show considerable convergence in their predictions.A detailed comparison of the metrics against a carefully validated dataset of kinship terms shows that the distributional methods likely capture a more fine-grained alignment than their counterpart colexification-based methods, and may thus be more suited for settings where fewer languages are evaluated.
Anthology ID:
2024.conll-1.26
Volume:
Proceedings of the 28th Conference on Computational Natural Language Learning
Month:
November
Year:
2024
Address:
Miami, FL, USA
Editors:
Libby Barak, Malihe Alikhani
Venue:
CoNLL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
327–341
Language:
URL:
https://aclanthology.org/2024.conll-1.26
DOI:
Bibkey:
Cite (ACL):
Taelin Karidi, Eitan Grossman, and Omri Abend. 2024. Aligning Alignments: Do Colexification and Distributional Similarity Align as Measures of cross-lingual Lexical Alignment?. In Proceedings of the 28th Conference on Computational Natural Language Learning, pages 327–341, Miami, FL, USA. Association for Computational Linguistics.
Cite (Informal):
Aligning Alignments: Do Colexification and Distributional Similarity Align as Measures of cross-lingual Lexical Alignment? (Karidi et al., CoNLL 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.conll-1.26.pdf