Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches

Shaohui Kuang, Deyi Xiong, Weihua Luo, Guodong Zhou


Abstract
Sentences in a well-formed text are connected to each other via various links to form the cohesive structure of the text. Current neural machine translation (NMT) systems translate a text in a conventional sentence-by-sentence fashion, ignoring such cross-sentence links and dependencies. This may produce an incoherent target text for a coherent source text. To address this issue, we propose a cache-based approach to modeling coherence for neural machine translation that captures contextual information either from recently translated sentences or from the entire document. In particular, we explore two types of caches: a dynamic cache, which stores words from the best translation hypotheses of preceding sentences, and a topic cache, which maintains a set of target-side topical words that are semantically related to the document to be translated. On this basis, we build a new layer that scores target words in these two caches with a cache-based neural model. The probabilities estimated by the cache-based neural model are combined with the NMT probabilities into the final word prediction probabilities via a gating mechanism. Finally, the proposed cache-based neural model is trained jointly with the NMT system in an end-to-end manner. Experiments and analysis presented in this paper demonstrate that the proposed cache-based model achieves substantial improvements over several state-of-the-art SMT and NMT baselines.
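The two mechanisms named in the abstract, a dynamic cache of previously translated words and a gate that mixes cache probabilities with NMT probabilities, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the class and function names, the cache size, the first-in-first-out eviction rule, the uniform cache scores (a stand-in for the paper's neural scorer), and the fixed scalar gate (in the paper the gate is part of the jointly trained model). It is not the authors' implementation.

```python
from collections import deque


class DynamicCache:
    """Hypothetical fixed-size store of target words from earlier translations."""

    def __init__(self, max_size=100):
        # A deque with maxlen silently drops the oldest word when full.
        self.words = deque(maxlen=max_size)

    def update(self, translated_sentence):
        """Add the words of a newly translated sentence, newest last."""
        for word in translated_sentence:
            if word in self.words:
                # Re-inserting moves a repeated word to the newest position.
                self.words.remove(word)
            self.words.append(word)


def uniform_cache_scores(cache):
    """Stand-in scorer: uniform probability over cached words (assumption)."""
    n = len(cache.words)
    return {w: 1.0 / n for w in cache.words} if n else {}


def gated_mixture(p_nmt, p_cache, gate):
    """Interpolate two word distributions with a scalar gate in [0, 1]."""
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    if not p_cache:
        # Nothing cached yet: fall back to the NMT distribution unchanged.
        return dict(p_nmt)
    vocab = set(p_nmt) | set(p_cache)
    return {
        w: (1.0 - gate) * p_nmt.get(w, 0.0) + gate * p_cache.get(w, 0.0)
        for w in vocab
    }


# Usage: words from the previous sentence raise the score of "shore".
cache = DynamicCache(max_size=4)
cache.update(["the", "shore", "was", "quiet"])
p_nmt = {"bank": 0.5, "shore": 0.3, "river": 0.2}
p_final = gated_mixture(p_nmt, uniform_cache_scores(cache), gate=0.2)
```

Because both inputs are probability distributions and the gate weights sum to one, the mixture is again a distribution over the union of the two vocabularies.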
Anthology ID:
C18-1050
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
596–606
URL:
https://aclanthology.org/C18-1050/
Cite (ACL):
Shaohui Kuang, Deyi Xiong, Weihua Luo, and Guodong Zhou. 2018. Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches. In Proceedings of the 27th International Conference on Computational Linguistics, pages 596–606, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches (Kuang et al., COLING 2018)
PDF:
https://aclanthology.org/C18-1050.pdf