Cogs in a Machine, Doing What They’re Meant to Do – the AMI Submission to the WMT24 General Translation Task

Atli Jasonarson, Hinrik Hafsteinsson, Bjarki Ármannsson, Steinthór Steingrímsson


Abstract
This paper presents the submission of the Arni Magnusson Institute’s team to the WMT24 General Translation Task. We work on the English→Icelandic translation direction. Our system comprises four translation models and a grammar correction model. To train our systems, we carefully curate our datasets, aggressively filtering out sentence pairs that may degrade the quality of our systems’ output. Some of our data are collected from human translations and some are synthetically generated. Part of the synthetic data is generated using an LLM, and we find that it significantly improves the translation capability of our system.
Anthology ID:
2024.wmt-1.18
Volume:
Proceedings of the Ninth Conference on Machine Translation
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
253–262
URL:
https://aclanthology.org/2024.wmt-1.18
Cite (ACL):
Atli Jasonarson, Hinrik Hafsteinsson, Bjarki Ármannsson, and Steinthór Steingrímsson. 2024. Cogs in a Machine, Doing What They’re Meant to Do – the AMI Submission to the WMT24 General Translation Task. In Proceedings of the Ninth Conference on Machine Translation, pages 253–262, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Cogs in a Machine, Doing What They’re Meant to Do – the AMI Submission to the WMT24 General Translation Task (Jasonarson et al., WMT 2024)
PDF:
https://aclanthology.org/2024.wmt-1.18.pdf