Fast-Paced Improvements to Named Entity Handling for Neural Machine Translation

Pedro Mota, Vera Cabarrão, Eduardo Farah


Abstract
In this work, we propose a Named Entity (NE) handling approach to improve translation quality within an existing Natural Language Processing (NLP) pipeline without modifying the Neural Machine Translation (NMT) component. Our approach seeks to enable fast delivery of such improvements and alleviate user experience problems related to NE distortion. We implement separate NE recognition and translation steps. Then, a combination of a standard entity masking technique and a novel semantic equivalent placeholder ensures both that the NE translation is respected and that the best overall quality is obtained from NMT. The experiments show that translation quality improves in 38.6% of the test cases when compared to a version of the NLP pipeline with less-developed NE handling capability.
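The standard entity masking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `__NE0__` token format, and the `toy_nmt` stand-in are all hypothetical, and the paper's semantic equivalent placeholder is not covered because the abstract does not describe it. The sketch only shows the general idea of replacing recognized NEs with opaque tokens, translating the masked text with an unmodified NMT component, and restoring the separately obtained NE translations afterwards.

```python
from typing import Callable, Dict, Tuple


def mask_entities(text: str, ne_translations: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Replace each known NE with an opaque token and remember its translation.

    `ne_translations` maps source-language NEs to their target-language form;
    it stands in for separate NE recognition and translation steps (assumption).
    """
    mapping: Dict[str, str] = {}
    masked = text
    # Longest entities first, so a longer NE is masked before any NE it contains.
    for i, source_ne in enumerate(sorted(ne_translations, key=len, reverse=True)):
        if source_ne not in masked:
            continue
        token = f"__NE{i}__"  # hypothetical token format
        masked = masked.replace(source_ne, token)
        mapping[token] = ne_translations[source_ne]
    return masked, mapping


def unmask_entities(translated: str, mapping: Dict[str, str]) -> str:
    """Put the pre-translated NEs back in place of their tokens."""
    for token, target_ne in mapping.items():
        translated = translated.replace(token, target_ne)
    return translated


def translate_with_masking(
    text: str, ne_translations: Dict[str, str], nmt: Callable[[str], str]
) -> str:
    """Mask NEs, call the NMT component unchanged, then restore the NEs."""
    masked, mapping = mask_entities(text, ne_translations)
    return unmask_entities(nmt(masked), mapping)


def toy_nmt(sentence: str) -> str:
    """Stand-in for a real NMT system (assumption): translates one fixed phrase."""
    return sentence.replace("I live in", "Eu moro em")


result = translate_with_masking(
    "I live in New York", {"New York": "Nova Iorque"}, toy_nmt
)
```

A known weakness of opaque tokens is that the NMT system translates the surrounding text without knowing what kind of word the token replaces, which is presumably the motivation for combining masking with a placeholder that carries meaning.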
Anthology ID:
2022.eamt-1.17
Volume:
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2022
Address:
Ghent, Belgium
Editors:
Helena Moniz, Lieve Macken, Andrew Rufener, Loïc Barrault, Marta R. Costa-jussà, Christophe Declercq, Maarit Koponen, Ellie Kemp, Spyridon Pilos, Mikel L. Forcada, Carolina Scarton, Joachim Van den Bogaert, Joke Daems, Arda Tezcan, Bram Vanroy, Margot Fonteyne
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
141–149
URL:
https://aclanthology.org/2022.eamt-1.17
Cite (ACL):
Pedro Mota, Vera Cabarrão, and Eduardo Farah. 2022. Fast-Paced Improvements to Named Entity Handling for Neural Machine Translation. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 141–149, Ghent, Belgium. European Association for Machine Translation.
Cite (Informal):
Fast-Paced Improvements to Named Entity Handling for Neural Machine Translation (Mota et al., EAMT 2022)
PDF:
https://aclanthology.org/2022.eamt-1.17.pdf