Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments

Dario Stojanovski; Alexander Fraser

doi:10.18653/v1/W18-6306

Abstract

Cross-sentence context can provide valuable information in Machine Translation and is critical for translation of anaphoric pronouns and for providing consistent translations. In this paper, we devise simple oracle experiments targeting coreference and coherence. Oracles are an easy way to evaluate the effect of different discourse-level phenomena in NMT using BLEU and eliminate the necessity to manually define challenge sets for this purpose. We propose two context-aware NMT models and compare them against models working on a concatenation of consecutive sentences. Concatenation models perform better, but are computationally expensive. We show that NMT models taking advantage of context oracle signals can achieve considerable gains in BLEU, of up to 7.02 BLEU for coreference and 1.89 BLEU for coherence on subtitles translation. Access to strong signals allows us to make clear comparisons between context-aware models.

Anthology ID: W18-6306
Volume: Proceedings of the Third Conference on Machine Translation: Research Papers
Month: October
Year: 2018
Address: Brussels, Belgium
Editors: Ondřej Bojar, Rajen Chatterjee, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Christof Monz, Matteo Negri, Aurélie Névéol, Mariana Neves, Matt Post, Lucia Specia, Marco Turchi, Karin Verspoor
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 49–60
URL: https://aclanthology.org/W18-6306/
DOI: 10.18653/v1/W18-6306
Cite (ACL): Dario Stojanovski and Alexander Fraser. 2018. Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 49–60, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): Coreference and Coherence in Neural Machine Translation: A Study Using Oracle Experiments (Stojanovski & Fraser, WMT 2018)
PDF: https://aclanthology.org/W18-6306.pdf
