Word Sense Disambiguation for Ancient Greek: Sourcing a training corpus through translation alignment

Alek Keersmaekers, Wouter Mercelis, Toon Van Hal


Abstract
This paper seeks to leverage translations of Ancient Greek texts to enhance the performance of automatic word sense disambiguation (WSD). Satisfactory WSD in Ancient Greek is achievable, provided that the system can rely on annotated data. Acknowledging the challenges of manually assigning meanings to every Greek lemma, this study explores strategies for deriving WSD data from parallel texts using sentence and word alignment. Our results suggest that, provided a word is sufficiently frequent, this technique allows a substantial volume of annotated data to be produced automatically, although significant obstacles to automating the process remain.
Anthology ID:
2023.alp-1.18
Volume:
Proceedings of the Ancient Language Processing Workshop
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti
Venues:
ALP | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
148–159
URL:
https://aclanthology.org/2023.alp-1.18
Cite (ACL):
Alek Keersmaekers, Wouter Mercelis, and Toon Van Hal. 2023. Word Sense Disambiguation for Ancient Greek: Sourcing a training corpus through translation alignment. In Proceedings of the Ancient Language Processing Workshop, pages 148–159, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Word Sense Disambiguation for Ancient Greek: Sourcing a training corpus through translation alignment (Keersmaekers et al., ALP-WS 2023)
PDF:
https://aclanthology.org/2023.alp-1.18.pdf