KYB General Machine Translation Systems for WMT23

Ben Li, Yoko Matsuzaki, Shivam Kalkar


Abstract
This paper describes our neural machine translation system for the WMT 2023 general machine translation shared task. Our model uses the base settings of the Transformer architecture, and we improve its performance through several strategies: fine-tuning the pretrained model on an extended dataset, and applying specialized pre- and post-processing techniques to further raise translation quality. Our central focus is efficient model training, aiming for high accuracy by pairing a compact model with curated data. For both the English-to-Japanese and Japanese-to-English directions, we also apply ensembling augmented by N-best ranking.
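The abstract's ensembling-with-N-best-ranking step can be illustrated with a minimal sketch. Everything below is a hypothetical illustration, not the authors' published implementation: the `generate` and `score` callables stand in for model-specific decoding and scoring, and averaging log-probabilities is one common ranking criterion, assumed here since the paper's exact criterion is not stated on this page. Each ensemble member proposes an N-best list, the pooled candidates are re-scored by every member, and the hypothesis with the highest average score is selected.

```python
# Hypothetical sketch of ensemble N-best reranking (not the authors' exact method).
from typing import Callable, List, Tuple

def rerank_nbest(
    source: str,
    models: List[object],
    generate: Callable[[object, str, int], List[Tuple[str, float]]],
    score: Callable[[object, str, str], float],
    n_best: int = 5,
) -> str:
    """Pool N-best lists from all ensemble members and pick the consensus-best hypothesis."""
    # Collect candidate translations from every ensemble member's N-best list.
    candidates = set()
    for model in models:
        for hypothesis, _ in generate(model, source, n_best):
            candidates.add(hypothesis)

    # Re-score each pooled candidate with every model and average the scores
    # (assumed criterion: mean log-probability across the ensemble).
    def ensemble_score(hypothesis: str) -> float:
        return sum(score(m, source, hypothesis) for m in models) / len(models)

    # Return the candidate the ensemble agrees is best.
    return max(candidates, key=ensemble_score)
```

Pooling candidates before re-scoring lets a hypothesis proposed by only one member still win if the other members also rate it highly, which is the usual motivation for combining ensembling with N-best ranking.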
Anthology ID: 2023.wmt-1.10
Volume: Proceedings of the Eighth Conference on Machine Translation
Month: December
Year: 2023
Address: Singapore
Editors: Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
Venue: WMT
SIG: SIGMT
Publisher: Association for Computational Linguistics
Pages: 137–142
URL: https://aclanthology.org/2023.wmt-1.10
DOI: 10.18653/v1/2023.wmt-1.10
Cite (ACL): Ben Li, Yoko Matsuzaki, and Shivam Kalkar. 2023. KYB General Machine Translation Systems for WMT23. In Proceedings of the Eighth Conference on Machine Translation, pages 137–142, Singapore. Association for Computational Linguistics.
Cite (Informal): KYB General Machine Translation Systems for WMT23 (Li et al., WMT 2023)
PDF: https://aclanthology.org/2023.wmt-1.10.pdf