Neural Machine Translation of Artwork Titles Using Iconclass Codes

Nikolay Banar, Walter Daelemans, Mike Kestemont


Abstract
We investigate the use of Iconclass in neural machine translation of artwork titles between Dutch and English (NL<->EN). Iconclass is an iconographic classification system widely used in the cultural heritage domain to describe and retrieve subjects represented in the visual arts. The resource contains keywords and definitions that encode the presence of objects, people, events and ideas depicted in artworks such as paintings. We propose a simple concatenation approach that improves the quality of automatically generated title translations by leveraging textual information extracted from Iconclass. Our results demonstrate that a neural machine translation system is able to exploit this metadata to improve translation quality for artwork titles. This technology enables applications of machine learning in resource-scarce domains in the cultural sector.
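The concatenation idea can be illustrated with a minimal sketch. It assumes the approach amounts to appending Iconclass text to the source title before the title is passed to the translation model; the abstract does not give the exact input format, so the separator token `<sep>`, the function name `build_source`, and the sample strings are all hypothetical.

```python
from typing import Sequence

# Hypothetical separator token: the abstract does not say how the title
# and the Iconclass text are delimited in the model input.
SEP = "<sep>"


def build_source(title: str, iconclass_texts: Sequence[str], sep: str = SEP) -> str:
    """Append Iconclass textual descriptions to a source-language title."""
    # Skip empty descriptions and collapse runs of whitespace.
    cleaned = [" ".join(t.split()) for t in iconclass_texts if t.strip()]
    title = " ".join(title.split())
    if not cleaned:
        # No usable metadata: fall back to the bare title.
        return title
    # Title first, then each description, joined by the separator.
    return f" {sep} ".join([title, *cleaned])
```

With the illustrative input `build_source("Stilleven met bloemen", ["flowers in a vase"])`, the function returns `Stilleven met bloemen <sep> flowers in a vase`. A title with no usable metadata is returned unchanged, so the same preprocessing could be applied to records with and without Iconclass annotations.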
Anthology ID:
2020.latechclfl-1.5
Volume:
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Month:
December
Year:
2020
Address:
Online
Venues:
CLFL | COLING | LaTeCH | LaTeCHCLfL
Publisher:
International Committee on Computational Linguistics
Pages:
42–51
URL:
https://aclanthology.org/2020.latechclfl-1.5
Cite (ACL):
Nikolay Banar, Walter Daelemans, and Mike Kestemont. 2020. Neural Machine Translation of Artwork Titles Using Iconclass Codes. In Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 42–51, Online. International Committee on Computational Linguistics.
Cite (Informal):
Neural Machine Translation of Artwork Titles Using Iconclass Codes (Banar et al., LaTeCHCLfL 2020)
PDF:
https://aclanthology.org/2020.latechclfl-1.5.pdf