Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns

KayYen Wong, Sameen Maruf, Gholamreza Haffari


Abstract
The advent of context-aware NMT has resulted in promising improvements in overall translation quality and specifically in the translation of discourse phenomena such as pronouns. Previous work has mainly focused on the use of past sentences as context, with an emphasis on anaphora translation. In this work, we investigate the effect of future sentences as context by comparing the performance of a contextual NMT model trained with future context to that of one trained with past context. Our experiments and evaluation, using generic and pronoun-focused automatic metrics, show that the use of future context not only achieves significant improvements over the context-agnostic Transformer, but also demonstrates comparable and in some cases improved performance over its counterpart trained on past context. We also perform an evaluation on a targeted cataphora test suite and report significant gains over the context-agnostic Transformer in terms of BLEU.
Anthology ID:
2020.acl-main.530
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Editors:
Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5971–5978
URL:
https://aclanthology.org/2020.acl-main.530
DOI:
10.18653/v1/2020.acl-main.530
Cite (ACL):
KayYen Wong, Sameen Maruf, and Gholamreza Haffari. 2020. Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5971–5978, Online. Association for Computational Linguistics.
Cite (Informal):
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns (Wong et al., ACL 2020)
PDF:
https://aclanthology.org/2020.acl-main.530.pdf
Video:
http://slideslive.com/38928844
Code:
sameenmaruf/acl2020-contextnmt-cataphora
Data:
OpenSubtitles