Training and Evaluation of Named Entity Recognition Models for Classical Latin

Marijke Beersmans, Evelien de Graaf, Tim Van de Cruys, Margherita Fantoli


Abstract
We evaluate the performance of various models on the task of named entity recognition (NER) for classical Latin. Using an existing dataset, we train two transformer-based LatinBERT models and one shallow conditional random field (CRF) model. The performance is assessed using both standard metrics and a detailed manual error analysis, and compared to the results obtained by previously released Latin NER tools. Both analyses demonstrate that the BERT models achieve a higher F1 score than the other models. Furthermore, we annotate new, unseen data for further evaluation of the models, and we discuss the impact of annotation choices on the results.
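The abstract does not say how the F1 score is computed. As an illustration only, the following minimal sketch shows one common convention for NER, entity-level exact-match F1 over BIO-tagged sequences. The BIO scheme, the label names (PERS, LOC), and the helper names `bio_to_spans` and `entity_f1` are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Set, Tuple

# A span is (start index, end index exclusive, label).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # Open a span on B-, or leniently on an I- with no open span.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) with exact span-and-label matching."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative tags only: the prediction finds the person but misses the place.
gold_tags = ["B-PERS", "I-PERS", "O", "B-LOC"]
pred_tags = ["B-PERS", "I-PERS", "O", "O"]
precision, recall, f1 = entity_f1(gold_tags, pred_tags)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this convention a predicted entity counts as correct only if both its boundaries and its label match the gold annotation, which is one reason annotation choices can affect the reported scores.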
Anthology ID:
2023.alp-1.1
Volume:
Proceedings of the Ancient Language Processing Workshop
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti
Venues:
ALP | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
1–12
URL:
https://aclanthology.org/2023.alp-1.1
Cite (ACL):
Marijke Beersmans, Evelien de Graaf, Tim Van de Cruys, and Margherita Fantoli. 2023. Training and Evaluation of Named Entity Recognition Models for Classical Latin. In Proceedings of the Ancient Language Processing Workshop, pages 1–12, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Training and Evaluation of Named Entity Recognition Models for Classical Latin (Beersmans et al., ALP-WS 2023)
PDF:
https://aclanthology.org/2023.alp-1.1.pdf