Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues

Sumire Honda, Patrick Fernandes, Chrysoula Zerva


Abstract
Despite the remarkable advancements in machine translation, the current sentence-level paradigm faces challenges when dealing with highly contextual languages like Japanese. In this paper, we explore how context-awareness can improve the performance of current Neural Machine Translation (NMT) models for English-Japanese business dialogue translation, and what kind of context provides meaningful information to improve translation. As business dialogue involves complex discourse phenomena but offers scarce training resources, we adapt a pretrained mBART model, fine-tuning it on multi-sentence dialogue data, which allows us to experiment with different contexts. We investigate the impact of larger context sizes and propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type. We make use of Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses, and we generalise CXMI to study the impact of the extra-sentential context. Overall, we find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size), and we provide a more focused analysis of honorifics translation. Regarding translation quality, increased source-side context paired with scene and speaker information improves model performance compared to previous work and our context-agnostic baselines, as measured by BLEU and COMET.
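As a rough illustration of the CXMI measure mentioned in the abstract: CXMI is commonly estimated as the average gain in reference log-probability when a model is given context, compared with a context-agnostic model. The sketch below assumes that per-sentence reference log-probabilities have already been obtained from both models; the function name `estimate_cxmi` and its inputs are hypothetical and are not taken from the paper, and the paper's generalisation to extra-sentential context is not reproduced here.

```python
from typing import Sequence


def estimate_cxmi(
    logp_context_agnostic: Sequence[float],
    logp_context_aware: Sequence[float],
) -> float:
    """Sample estimate of CXMI over a test set (hypothetical helper).

    Each list holds one value per sentence: the log-probability that a model
    assigns to the reference translation, without and with context.
    A positive result suggests the model benefits from the context.
    """
    if len(logp_context_agnostic) != len(logp_context_aware):
        raise ValueError("Both inputs must cover the same sentences.")
    if not logp_context_agnostic:
        raise ValueError("At least one sentence is required.")

    # Mean of log q_C(y | x, C) - log q_A(y | x) over all sentences.
    gains = [
        aware - agnostic
        for agnostic, aware in zip(logp_context_agnostic, logp_context_aware)
    ]
    return sum(gains) / len(gains)
```

For example, `estimate_cxmi([-10.0, -8.0], [-9.0, -7.5])` averages the per-sentence gains 1.0 and 0.5.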
Anthology ID:
2023.mtsummit-research.23
Volume:
Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
Month:
September
Year:
2023
Address:
Macau SAR, China
Editors:
Masao Utiyama, Rui Wang
Venue:
MTSummit
Publisher:
Asia-Pacific Association for Machine Translation
Pages:
272–285
URL:
https://aclanthology.org/2023.mtsummit-research.23
Cite (ACL):
Sumire Honda, Patrick Fernandes, and Chrysoula Zerva. 2023. Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues. In Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track, pages 272–285, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal):
Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues (Honda et al., MTSummit 2023)
PDF:
https://aclanthology.org/2023.mtsummit-research.23.pdf