Toon Van Hal
Word Sense Disambiguation for Ancient Greek: Sourcing a training corpus through translation alignment
Alek Keersmaekers
Wouter Mercelis
Toon Van Hal
Proceedings of the Ancient Language Processing Workshop
This paper seeks to leverage translations of Ancient Greek texts to enhance the performance of automatic word sense disambiguation (WSD). Satisfactory WSD in Ancient Greek is achievable, provided that the system can rely on annotated data. Acknowledging the difficulty of manually assigning meanings to every Greek lemma, this study explores strategies for deriving WSD data from parallel texts using sentence and word alignment. Our results suggest that, for sufficiently frequent words, this technique can automatically produce a large volume of annotated data, although substantial obstacles to fully automating the process remain.
In Search of the Flocks: How to Perform Onomasiological Queries in an Ancient Greek Corpus?
Alek Keersmaekers
Toon Van Hal
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages
This paper explores the possibilities of onomasiologically querying corpus data of Ancient Greek. The significance of the onomasiological approach has been highlighted in recent studies, yet the possibilities of performing ‘word-finding’ investigations into corpus data have not been dealt with in depth. The case study focuses on collective nouns denoting animate groups (such as flocks of people, herds of cattle). By relying on a large automatically annotated corpus of Ancient Greek and on token-based vector information, a longlist of collective nouns was compiled through morpho-syntactic extraction and successive clustering procedures. After this longlist was reduced to a shortlist, the results obtained were evaluated. In general, we find that πλῆθος can be considered the default collective noun for both humans and animals, becoming especially prominent during the Hellenistic period. In addition, specific tendencies in the use of collective nouns are discerned for specific semantic classes (e.g. gods and insects) and over time. Throughout the paper, special attention is paid to methodological issues related to onomasiological searching.
Creating, Enriching and Valorizing Treebanks of Ancient Greek
Alek Keersmaekers
Wouter Mercelis
Colin Swaelens
Toon Van Hal
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)