Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet

Alexandre Diniz da Costa, Mateus Coutinho Marim, Ely Matos, Tiago Timponi Torrent


Abstract
In this paper we present Scylla, a methodology for domain adaptation of Neural Machine Translation (NMT) systems that makes use of a multilingual FrameNet enriched with qualia relations as an external knowledge base. Domain adaptation techniques used in NMT usually require fine-tuning and in-domain training data, which may pose difficulties for those working with lesser-resourced languages and may also degrade the performance of the NMT system on out-of-domain sentences. Scylla does not require fine-tuning of the NMT model, avoiding the risk of model over-fitting and the consequent decrease in performance for out-of-domain translations. Two versions of Scylla are presented: one using the source sentence as input, and another using the target sentence. We evaluate Scylla against a state-of-the-art commercial NMT system in an experiment in which 50 sentences from the Sports domain are translated from Brazilian Portuguese to English. Both versions of Scylla significantly outperform the baseline commercial system in HTER.
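For readers unfamiliar with the metric, HTER (Human-targeted Translation Edit Rate) is commonly described as the number of edits needed to turn a machine translation into its human post-edited version, divided by the number of words in the post-edited version; lower is better. The sketch below is only an approximation under stated assumptions: it counts word-level insertions, deletions, and substitutions and omits the block shifts that full TER-style scoring also allows. The function names and example sentences are hypothetical and are not taken from the paper or its evaluation data.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over tokens (insert, delete, substitute only)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_hter(mt_output, post_edited):
    """Approximate HTER: edits divided by post-edited length (no shifts)."""
    hyp = mt_output.split()
    ref = post_edited.split()
    if not ref:
        raise ValueError("post-edited reference must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)


# Hypothetical example: one substitution plus one insertion over 5 words.
score = approx_hter("the player shot ball", "the player kicked the ball")
```

Because shifts are ignored, this approximation can overestimate the score for outputs that differ from the post-edited text mainly in word order.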
Anthology ID:
2022.lrec-1.1
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1–12
URL:
https://aclanthology.org/2022.lrec-1.1
Cite (ACL):
Alexandre Diniz da Costa, Mateus Coutinho Marim, Ely Matos, and Tiago Timponi Torrent. 2022. Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1–12, Marseille, France. European Language Resources Association.
Cite (Informal):
Domain Adaptation in Neural Machine Translation using a Qualia-Enriched FrameNet (Costa et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.1.pdf
Data
FrameNet