Did You Enjoy the Last Supper? An Experimental Study on Cross-Domain NER Models for the Art Domain

Alejandro Sierra-Múnera, Ralf Krestel


Abstract
Named entity recognition (NER) is an important task that forms the basis for multiple downstream natural language processing tasks. Traditional machine learning approaches for NER rely on annotated corpora. However, such corpora are largely available only for standard domains, e.g., news articles. Domain-specific NER often lacks annotated training data, which leaves two options: expensive manual annotation or transfer learning. In this paper, we study a selection of cross-domain NER models and evaluate them for use in the art domain, particularly for recognizing artwork titles in digitized art-historic documents. To evaluate the models, we employ a variety of source domain datasets and analyze how each one affects the performance of the different models on our target domain. Additionally, we analyze the impact of the source domain’s entity types to better understand how the transfer learning models map different source entity types onto our target entity types.
Anthology ID:
2021.nlp4dh-1.20
Volume:
Proceedings of the Workshop on Natural Language Processing for Digital Humanities
Month:
December
Year:
2021
Address:
NIT Silchar, India
Editors:
Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
Venue:
NLP4DH
Publisher:
NLP Association of India (NLPAI)
Pages:
173–182
URL:
https://aclanthology.org/2021.nlp4dh-1.20
Cite (ACL):
Alejandro Sierra-Múnera and Ralf Krestel. 2021. Did You Enjoy the Last Supper? An Experimental Study on Cross-Domain NER Models for the Art Domain. In Proceedings of the Workshop on Natural Language Processing for Digital Humanities, pages 173–182, NIT Silchar, India. NLP Association of India (NLPAI).
Cite (Informal):
Did You Enjoy the Last Supper? An Experimental Study on Cross-Domain NER Models for the Art Domain (Sierra-Múnera & Krestel, NLP4DH 2021)
PDF:
https://aclanthology.org/2021.nlp4dh-1.20.pdf
Code
 hpi-information-systems/cross-domain-ner
Data
CoNLL 2003