Towards Coreference Resolution for Early Irish

Mark Darling, Marieke Meelen, David Willis


Abstract
In this article, we outline some of the issues involved in developing a semi-supervised procedure for coreference resolution for early Irish. This work is part of a wider enterprise to create a parsed corpus of historical Irish with enriched annotation for information structure and anaphoric coreference. We describe how existing resources, notably the POMIC historical Irish corpus and the Cesax annotation algorithm, have had to be adapted: the first to provide suitable input for coreference resolution, the second to cope with specific aspects of early Irish grammar. We also describe features of a part-of-speech tagger that we have developed for early Irish as part of the first task and with a view to expanding the size of the future corpus.
Anthology ID: 2022.cltw-1.12
Volume: Proceedings of the 4th Celtic Language Technology Workshop within LREC2022
Month: June
Year: 2022
Address: Marseille, France
Editors: Theodorus Fransen, William Lamb, Delyth Prys
Venue: CLTW
Publisher: European Language Resources Association
Pages: 85–93
URL: https://aclanthology.org/2022.cltw-1.12
Cite (ACL): Mark Darling, Marieke Meelen, and David Willis. 2022. Towards Coreference Resolution for Early Irish. In Proceedings of the 4th Celtic Language Technology Workshop within LREC2022, pages 85–93, Marseille, France. European Language Resources Association.
Cite (Informal): Towards Coreference Resolution for Early Irish (Darling et al., CLTW 2022)
PDF: https://aclanthology.org/2022.cltw-1.12.pdf