Predicate Sense Disambiguation for UMR Annotation of Latin: Challenges and Insights

Federica Gamba

doi:10.18653/v1/2024.ml4al-1.3

Abstract

This paper explores whether different Pretrained Language Models (PLMs) can assist in a manual annotation task: assigning the appropriate sense to verbal predicates in a Latin text. This is a crucial step when annotating data according to the Uniform Meaning Representation (UMR) framework, which is designed to annotate the semantic content of a text from a cross-linguistic perspective. We approach the study as a Word Sense Disambiguation task, with the primary goal of assessing the feasibility of leveraging available resources for Latin to streamline the labor-intensive annotation process. Our methodology relies on contextual embeddings to compute token similarity, under the assumption that predicates sharing a similar sense also share their contexts of occurrence. We discuss our findings, emphasizing the applicability and limitations of this approach for Latin, where the limited amount of available resources poses additional challenges.
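The similarity-based idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the vectors below are made-up toy values standing in for contextual embeddings that a PLM would produce, and the sense labels (`sense-A`, `sense-B`) and the helper `rank_senses` are hypothetical names introduced only for this example. The sketch ranks candidate senses for a target predicate token by the cosine similarity between its embedding and the embeddings of already-labelled occurrences.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_senses(target_vec, labelled_examples):
    """Rank senses by the best-matching labelled occurrence of each sense.

    labelled_examples: list of (sense_label, embedding) pairs.
    Returns a list of (sense_label, score) pairs, highest score first.
    """
    best = {}
    for sense, vec in labelled_examples:
        score = cosine(target_vec, vec)
        if sense not in best or score > best[sense]:
            best[sense] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Toy data (assumption): in practice these would be contextual embeddings
# of predicate tokens extracted from a pretrained language model.
labelled = [
    ("sense-A", [0.9, 0.1, 0.0]),
    ("sense-A", [0.8, 0.2, 0.1]),
    ("sense-B", [0.1, 0.9, 0.3]),
]
target = [0.85, 0.15, 0.05]

ranking = rank_senses(target, labelled)
for sense, score in ranking:
    print(f"{sense}: {score:.3f}")
```

In an annotation-support setting, a ranked list like this would be offered to the human annotator as a suggestion rather than taken as a final decision.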

Anthology ID: 2024.ml4al-1.3
Volume: Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)
Month: August
Year: 2024
Address: Hybrid in Bangkok, Thailand and online
Editors: John Pavlopoulos, Thea Sommerschield, Yannis Assael, Shai Gordin, Kyunghyun Cho, Marco Passarotti, Rachele Sprugnoli, Yudong Liu, Bin Li, Adam Anderson
Venues: ML4AL | WS
Publisher: Association for Computational Linguistics
Pages: 19–29
URL: https://aclanthology.org/2024.ml4al-1.3/
DOI: 10.18653/v1/2024.ml4al-1.3
Cite (ACL): Federica Gamba. 2024. Predicate Sense Disambiguation for UMR Annotation of Latin: Challenges and Insights. In Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024), pages 19–29, Hybrid in Bangkok, Thailand and online. Association for Computational Linguistics.
Cite (Informal): Predicate Sense Disambiguation for UMR Annotation of Latin: Challenges and Insights (Gamba, ML4AL 2024)
PDF: https://aclanthology.org/2024.ml4al-1.3.pdf
